********** Welcome to STN International **********

NEWS 1 Web Page URLs for STN Seminar Schedule - N. America 
NEWS 2 "Ask CAS" for self-help around the clock
NEWS 3 JAN 17 Pre-1988 INPI data added to MARPAT 
NEWS 4 FEB 21 STN AnaVist, Version 1.1, lets you share your STN AnaVist 
visualization results 

NEWS 5 FEB 22 The IPC thesaurus added to additional patent databases on STN 
NEWS 6 FEB 22 Updates in EPFULL; IPC 8 enhancements added 
NEWS 7 FEB 27 New STN AnaVist pricing effective March 1, 2006 
NEWS 8 MAR 03 Updates in PATDPA; addition of IPC 8 data without attributes 
NEWS 9 MAR 22 EMBASE is now updated on a daily basis 
NEWS 10 APR 03 New IPC 8 fields and IPC thesaurus added to PATDPAFULL 
NEWS 11 APR 03 Bibliographic data updates resume; new IPC 8 fields and IPC
thesaurus added in PCTFULL
NEWS 12 APR 04 STN AnaVist $500 visualization usage credit offered 
NEWS 13 APR 12 LINSPEC, learning database for INSPEC, reloaded and enhanced 
NEWS 14 APR 12 Improved structure highlighting in FQHIT and QHIT display
in MARPAT 

NEWS 15 APR 12 Derwent World Patents Index to be reloaded and enhanced during 
second quarter; strategies may be affected 

NEWS 16 MAY 10 CA/CAplus enhanced with 1900-1906 U.S. patent records 
NEWS 17 MAY 11 KOREAPAT updates resume 

NEWS 18 MAY 19 Derwent World Patents Index to be reloaded and enhanced 
NEWS 19 MAY 30 IPC 8 Rolled-up Core codes added to CA/CAplus and 
USPATFULL/USPAT2 

NEWS 20 MAY 30 The F-Term thesaurus is now available in CA/CAplus 
NEWS 21 JUN 02 The first reclassification of IPC codes now complete in 
INPADOC 

NEWS EXPRESS JUNE 16 CURRENT WINDOWS VERSION IS V8.01b, CURRENT
MACINTOSH VERSION IS V6.0c(ENG) AND V6.0Jc(JP), AND CURRENT
DISCOVER FILE IS DATED 23 MAY 2006.

NEWS HOURS STN Operating Hours Plus Help Desk Availability 
NEWS LOGIN Welcome Banner and News Items 

NEWS IPC8 For general information regarding STN implementation of IPC 8 
NEWS X25 X.25 communication option no longer available after June 2006 

Enter NEWS followed by the item number or name to see news on that 
specific topic. 

All use of STN is subject to the provisions of the STN Customer 
agreement. Please note that this agreement limits use to scientific 
research. Use for software development or design or implementation 
of commercial gateways or other similar uses is prohibited and may 
result in loss of user privileges and other penalties. 

************** STN Columbus ***************
FILE 'HOME' ENTERED AT 18:48:15 ON 19 JUN 2006 
=> file caplus 

COST IN U.S. DOLLARS                 SINCE FILE      TOTAL
                                          ENTRY    SESSION
FULL ESTIMATED COST                        0.21       0.21

FILE 'CAPLUS' ENTERED AT 18:48:28 ON 19 JUN 2006 

USE IS SUBJECT TO THE TERMS OF YOUR STN CUSTOMER AGREEMENT. 

PLEASE SEE "HELP USAGETERMS" FOR DETAILS. 

COPYRIGHT (C) 2006 AMERICAN CHEMICAL SOCIETY (ACS) 



Copyright of the articles to which records in this database refer is 
held by the publishers listed in the PUBLISHER (PB) field (available 
for records published or updated in Chemical Abstracts after December 
26, 1996), unless otherwise indicated in the original publications. 
The CA Lexicon is the copyrighted intellectual property of the 
American Chemical Society and is provided to assist you in searching 
databases on STN. Any dissemination, distribution, copying, or storing 
of this information, without the prior written consent of CAS, is 
strictly prohibited. 

FILE COVERS 1907 - 19 Jun 2006 VOL 144 ISS 26 
FILE LAST UPDATED: 18 Jun 2006 (20060618/ED) 

Effective October 17, 2005, revised CAS Information Use Policies apply. 
They are available for your review at: 

http://www.cas.org/infopolicy.html 

=> s (simulat?(10a)(microarray# or (micro(w)array#))/bi,ab
UNMATCHED LEFT PARENTHESIS '(SIMULAT?' 
The number of right parentheses in a query must be equal to the 
number of left parentheses. 

=> s (simulat?(10a)(microarray# or (micro(w)array#)))/bi,ab
 746182 SIMULAT?/BI
 427790 SIMULAT?/AB
 40727 MICROARRAY#/BI
 22794 MICROARRAY#/AB
 145099 ARRAY#/BI
 134850 ARRAY#/AB

L1 341 (SIMULAT?(10A)(MICROARRAY# OR (MICRO(W)ARRAY#)))/BI,AB

=> s (computer?(10a)(microarray# or (micro(w)array#)))/bi,ab
 293456 COMPUTER?/BI
 207478 COMPUTER?/AB
 40727 MICROARRAY#/BI
 22794 MICROARRAY#/AB
 154958 MICRO/BI
 93105 MICRO/AB
 145099 ARRAY#/BI
 134850 ARRAY#/AB

L2 641 (COMPUTER?(10A)(MICROARRAY# OR (MICRO(W)ARRAY#)))/BI,AB

=> s l1 or l2

L3 929 L1 OR L2

=> s l3 not 2006/py
 572333 2006/PY
L4 859 L3 NOT 2006/PY

=> s l4 not 2005/py
 1278142 2005/PY
L5 611 L4 NOT 2005/PY

=> s l5 not 2004/py
 1291977 2004/PY
L6 407 L5 NOT 2004/PY

=> d his 

(FILE 'HOME' ENTERED AT 18:48:15 ON 19 JUN 2006) 
FILE 'CAPLUS' ENTERED AT 18:48:28 ON 19 JUN 2006 

L1 341 S (SIMULAT?(10A)(MICROARRAY# OR (MICRO(W)ARRAY#)))/BI,AB

L2 641 S (COMPUTER?(10A)(MICROARRAY# OR (MICRO(W)ARRAY#)))/BI,AB

L3 929 S L1 OR L2

L4 859 S L3 NOT 2006/PY

L5 611 S L4 NOT 2005/PY

L6 407 S L5 NOT 2004/PY 
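
The answer counts in this history can be cross-checked with simple set arithmetic. The sketch below is only an illustration: the counts are copied from the history, while the overlap between L1 and L2 and the per-year removals are derived values that the session itself never displays.

```python
# Answer-set sizes copied from the session history.
l1 = 341   # SIMULAT? within 10 words of the microarray terms
l2 = 641   # COMPUTER? within 10 words of the microarray terms
l3 = 929   # L1 OR L2

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|,
# so the number of records retrieved by both queries is:
overlap = l1 + l2 - l3

# Records dropped by each successive publication-year exclusion.
l4, l5, l6 = 859, 611, 407
removed = {"2006": l3 - l4, "2005": l4 - l5, "2004": l5 - l6}

print(overlap)   # records common to L1 and L2 (derived, not shown by STN)
print(removed)   # records excluded per publication year (derived)
```

The derived overlap is what a command such as `s l1 and l2` would be expected to return, assuming no change to the file between queries.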



=> d l6 1-407 bib ab

L6 ANSWER 1 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2005:629351 CAPLUS 
DN 143:147283 

TI Optimal design of single factor cDNA microarray experiments and mixed models for 
gene expression data 
AU Yang, Xiao 

CS Virginia Polytechnic Institute and State Univ., Blacksburg, VA, USA 

SO (2003) 98 pp. Avail.: UMI, Order No. DA3141112 From: Diss. Abstr. Int., B 2005, 

65(7), 3529 

DT Dissertation 

LA English 

AB Unavailable 

L6 ANSWER 2 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2005:481472 CAPLUS 
DN 143:127766 

TI Data analysis tools for DNA microarrays 
AU Draghici, Sorin
CS USA

SO (2003) Publisher: (Chapman & Hall/CRC, Boca Raton, Fla.), 512 pp. ISBN: 1-58488-315-4

DT Book 

LA English 

AB Unavailable 

L6 ANSWER 3 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2005:213730 CAPLUS 
DN 143:54373 

TI DNA Microarrays and Gene Expression: From Experiments to Data Analysis and
Modeling

AU Baldi, Pierre; Hatfield, G. Wesley 
CS UK 

SO (2002) Publisher: (Cambridge University Press, Cambridge, UK), 200 pp. ISBN: 0-521-80022-6

DT Book 

LA English 

AB Unavailable 

L6 ANSWER 4 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2005:131600 CAPLUS 
DN 143:24559 

TI Effects of CpG oligodeoxynucleotide on gene expression in immunocytes 
AU Hu, Zhenlin; Wan, Bin; Zhou, Fengjuan; Wang, Jing; Wang, Qingmin; Sun, Shuhan 
CS Department of Medical Genetics, College of Basic Medical Sciences, Second Military 
Medical University, Shanghai, 200433, Peop. Rep. China 

SO Dier Junyi Daxue Xuebao (2003), 24(10), 1086-1089 CODEN: DJXUE5; ISSN:
0258-879X

PB Dier Junyi Daxue Xuebao Bianjibu 
DT Journal 
LA Chinese 

AB The influence of CpG oligodeoxynucleotide (CpG-ODN) on the gene expression in
immunocytes and its mechanisms were studied. RAW264.7 cells were stimulated with
CpG-ODN for 6 h, and mRNA from both control and treated cells were isolated and
purified, then were reversely transcribed to cDNA with the incorporation of
fluorescent-labeled dUTP to prep. the hybridization probes. The mixed probes were then
hybridized to the cDNA microarray MGEC-80s. After high-stringent washing, the cDNA
***microarray*** was scanned by ***computer*** system and the differently
expressed genes were obtained. A total of 119 differently expressed genes were
detected after CpG-ODN stimulation, of which 74 were up-regulated and 45 were
down-regulated. These genes were related to cell cycle, immune modulation, lipid
metab., foam cell formation, and signal transduction. CpG-ODN may widely modulate
the expression of many genes in immune cells, and further anal. of related genes may
help understand the mol. mechanisms of CpG-ODN.

L6 ANSWER 5 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:980924 CAPLUS 
DN 142:149687 

TI Arrays of oligonucleotide probes for digital recognition by computer 

IN Jin, Dong Gyu 

PA Cosmogenome Inc., S. Korea 

SO Repub. Korean Kongkae Taeho Kongbo, No pp. given CODEN: KRXXA7 
DT Patent 
LA Korean 

FAN.CNT 1 PATENT NO. KIND DATE APPLICATION NO. DATE 

PI KR 2003060907 A 20030716 KR 2003-705129 20030411 
PRAI KR 2003-705129 20030411 

AB Arrays of oligonucleotide probes for digital recognition by a computer are
provided, thereby easily and rapidly interpreting the anal. results of dots in a chip. An
array of oligonucleotide probes on a solid support for detecting the point mutation of
testee's DNA sample comprises (i) a labeling part of arrays including catalog no., gene 
sequence no., ID no., command and IP address, which indicates information for sample 
DNA identification to be read by a computer; and (ii) a logic part of arrays including 
arrays of probes in 4 columns in at least 100 up to 100,000 rows, wherein each column 
consists of 2 symbols, that are, a control symbol having a detectible marker for digital 
recognition by a computer and hybridization symbol comprising oligonucleotide probes 
in the 5 to 30 nucleotides length occupying known sites by substituting target 
oligonucleotide into ACGT in each column. 

L6 ANSWER 6 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:947551 CAPLUS 
DN 143:282023 

TI Computer simulation system of DNA-binding protein experiment based on dsDNA
microarray

AU Xie, Jianming; Bai, Yunfei; Qian, Lulu; Cui, Lei; Sun, Xiao; Lu, Zuhong 

CS Chien-Shiung Wu Laboratory, Southeast University, Nanjing, 210096, Peop. Rep. 

China 

SO Shengwu Wuli Xuebao (2003), 19(2), 156-160 CODEN: SWXUEN; ISSN: 1000- 
6737 

PB Shengwu Wuli Xuebao Bianjibu 
DT Journal 
LA Chinese 

AB DsDNA (double-stranded DNA) microarray, as a novel high-throughput technique,
has made a start in the field of DNA-binding protein research. A dsDNA probe
designing method was studied for a new fabrication technol. of the dsDNA microarray.
Then a computer software named 'DBP' was introduced, which ***simulated*** the
procedures of DNA-binding protein expt. based on dsDNA ***microarray*** and
included DNA digestion using restriction enzymes, electrophoresis, hybridization and
data management. DBP software can design dsDNA probes and give advice on
expt. planning and predictive results of a DNA-binding protein expt.

L6 ANSWER 7 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:938025 CAPLUS 
DN 142:211570 

TI Effect of .beta.-carotene on gene expression of breast cancer cells

AU Li, Zhong; Hu, Chunyan; Mo, Baoqing; Xu, Jida; Zhao, Yan 

CS Department of Nutrition and Food Science, Nanjing Medical University, Nanjing, 

Jiangsu Province, 210029, Peop. Rep. China 

SO Aizheng (2003), 22(4), 380-384 CODEN: AIZHE4; ISSN: 1000-467X 
PB Sun Yat-sen Daxue, Aizheng Zhongxin 
DT Journal 
LA Chinese 

AB Study the altered gene expression of MCF-7 cell before and after the treatment 
with .beta.-carotene using cDNA microarray and the mechanism that .beta.-carotene
induce breast cancer cell apoptosis. Two fluorescence cDNA probes were made using
reverse transcription reaction from mRNA of .beta.-carotene untreated or treated MCF-7
cells (human estrogen receptor pos. breast cancer cells), marked with two different
fluorescence dyes (cy3 and cy5) resp., hybridized with expressed cDNA
***microarray***, scanned and analyzed by ***computer*** system and finally
the expressed gene was produced. A total of 21 genes related to cell apoptosis, cell
signal transduction, protein translation and immunity were expressed differently after
the treatment of .beta.-carotene, of which 3/21 were up-regulated (AF040958, AK001555,
g41894), 18/21 were down-regulated (hshsp90r, U83857, AB014509, AF126028,
AF053641, AF117386, AF050127, NMJH2177, humtopi, AJ250915, U37547, U78798,
NM_004849, NM_005346, af004711, NM_006595, NM_001418, AB015051). The
results suggested that .beta.-carotene might inhibit the growth of breast cancer cells 
through inducing apoptosis, breaking signal transduction, and blocking protein 
translation. 

L6 ANSWER 8 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:800698 CAPLUS 
DN 142:149649 



TI Multiclass classification of microarray data with repeated measurements:
application to cancer

AU Yeung, Ka Yee; Bumgarner, Roger E.

CS Department of Microbiology, University of Washington, Seattle, WA, 98195, USA 

SO GenomeBiology (2003), 4(12), No pp. given CODEN: GNBLFW; ISSN: 1465-6914 

URL: http://genomebiology.com/content/pdf/gb-2003-4-12-r83.pdf 

PB BioMed Central Ltd. 

DT Journal; (online computer file) 

LA English 

AB Prediction of the diagnostic category of a tissue sample from its gene-expression 
profile and selection of relevant genes for class prediction have important applications
in cancer research. We have developed the uncorrected shrunken centroid (USC) and
error-weighted, uncorrected shrunken centroid (EWUSC) algorithms that are applicable
to microarray data with any no. of classes. We show that removing highly correlated
genes typically improves classification results using a small set of genes.
RE.CNT 36 THERE ARE 36 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 9 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:800097 CAPLUS 
DN 142:149274 

TI Application of independent component analysis to microarrays 
AU Lee, Su-In; Batzoglou, Serafim 

CS Dep. Electrical Eng., Stanford Univ., Stanford, CA, 94305-9010, USA 

SO GenomeBiology (2003), 4(11), No pp. given CODEN: GNBLFW; ISSN: 1465-6914 

URL: http://genomebiology.com/content/pdf/gb-2003-4-11-r76.pdf

PB BioMed Central Ltd. 

DT Journal; (online computer file)

LA English 

AB We apply linear and nonlinear independent component anal. (ICA) to project 
microarray data into statistically independent components that correspond to putative 
biol. processes, and to cluster genes according to over- or under-expression in each 
component. We test the statistical significance of enrichment of gene annotations 
within clusters. ICA outperforms other leading methods, such as principal component 
anal., k-means clustering and the Plaid model, in constructing functionally coherent
clusters on microarray datasets from Saccharomyces cerevisiae, Caenorhabditis elegans 
and human. 

RE.CNT 75 THERE ARE 75 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 10 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:719122 CAPLUS 
DN 141:201243 

TI Visualization of gene expression data - the GE-biplot, the chip-plot and the
gene-plot

AU Pittelkow, Yvonne E.; Wilson, Susan R. 
CS Australian Natl. Univ., Australia 

SO Statistical Applications in Genetics and Molecular Biology (2003), 2(1), No pp. 
given CODEN: SAGMCU; ISSN: 1544-6115 URL: 

http://www.bepress.com/cgi/viewcontent.cgi?article=1019&context=sagmb

PB Berkeley Electronic Press 

DT Journal; (online computer file) 

LA English 

AB Visualization methods for exploring microarray data are particularly important for 
gaining insight into data from gene expression expts., such as those concerned with 
the development of an understanding of gene function and interactions. Further, good 
visualization techniques are useful for outlier detection in microarray data and for 
aiding biol. interpretation of results, as well as for presentation of overall summaries of 
the data. The biplot is particularly useful for the display of microarray data as both the
genes and the chips can be simultaneously plotted. In this paper we describe several
ordination techniques suitable for exploring microarray data, and we call these the
GE-biplot, the Chip-plot and the Gene-plot. The general method is first evaluated on
synthetic data ***simulated*** in accord with current biol. interpretation of 
***microarray*** data. Then it is applied to two well-known data sets, namely the 
colon data of Alon et al. (1999) and the leukemia data of Golub et al. (1999). The 
usefulness of the approach for interpreting and comparing different analyses of the 
same data is demonstrated. 

RE.CNT 41 THERE ARE 41 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 11 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 

AN 2004:623389 CAPLUS 
DN 142:310444 

TI Improving the specificity of biological signal detection from microarray data

AU Troyanskaya, Olga G. 

CS Stanford Univ., Stanford, CA, USA 

SO (2003) 118 pp. Avail.: UMI, Order No. DA3104167 From: Diss. Abstr. Int., B 2004, 
64(9), 4181 

DT Dissertation 

LA English 

AB Unavailable 

L6 ANSWER 12 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 

AN 2004:572371 CAPLUS 
DN 141:200855 

TI Probabilistic estimation of microarray data reliability and underlying gene 
expression 

AU Bilke, S.; Breslin, T.; Sigvardsson, M. 






Serial No. 10/501,848 
STN SEARCH - a 



CS Complex Systems Division, Department of Theoretical Physics, University of Lund, 
Lund, SE-22185, Swed. 

SO Los Alamos National Laboratory, Preprint Archive, Quantitative Biology (2003) 1- 

12, arXiv:q-bio.QM/0309006, 18 Sep 2003 CODEN: LANLCJ URL: 

http://xxx.lanl.gov/pdf/q-bio.QM/0309006 

PB Los Alamos National Laboratory 

DT Preprint 

LA English 

AB The availability of high throughput methods for measurement of mRNA concns. 
makes the reliability of conclusions drawn from the data and global quality control of 
samples and hybridization important issues. These issues were addressed by an 
information theoretic approach, applied to discretized expression values in replicated 
gene expression data. The approach yields a quant. measure of two important
parameter classes: First, the probability P(.sigma.|S) that a gene is in the biol. state 
.sigma. in a certain variety, given its obsd. expression S in the samples of that variety. 
Second, sample specific error probabilities which serve as consistency indicators of the 
measured samples of each variety. The method and its limitations are tested on gene 
expression data for developing murine B-cells and a t-test is used as ref. On a set of 
known genes it performs better than the t-test despite the crude discretization into only 
two expression levels. The consistency indicators, i.e. the error probabilities, correlate 
well with variations in the biol. material and thus prove efficient. The proposed method 
is effective in detg. differential gene expression and sample reliability in replicated 
microarray data. Already at two discrete expression levels in each sample, it gives a 
good explanation of the data and is comparable to std. techniques. 
RE.CNT 22 THERE ARE 22 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 13 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:567994 CAPLUS 
DN 141:272143 

TI Evaluation of sensitivity, performance and reproducibility of microarray technology 
in neuronal tissue 

AU Evans, S. J.; Watson, S. J.; Akil, H. 

CS Mental Health Research Institute, University of Michigan, Ann Arbor, MI, 48109, 
USA 

SO Integrative and Comparative Biology (2003), 43(6), 780-785 CODEN: ICBNBD; 
ISSN: 1540-7063 

PB Society for Integrative and Comparative Biology 
DT Journal 
LA English 

AB Microarray technol. is a powerful technique that allows the simultaneous study of 
thousands of gene transcripts. During the past two years there has been an explosion 
of publications describing expts. utilizing microarray technol. that range from original 
research findings from biol. paradigms to math. modeled systems. However,
neuroscientists using microarray technol. face significant challenges due to high tissue 
complexity, low abundance transcripts, and small magnitude changes in transcript 
levels that have significant biol. impact. This manuscript describes a series of studies 
designed to address issues regarding microarray sensitivity, ability of microarrays to 
detect subtle changes, and reproducibility of microarray expts., all in the context of 
neuronal tissue. From the presentation of these studies, the authors argue that 
although microarray technol. is limited with regards to sensitivity, the outcome of these 
expts., if approached with appropriate skepticism, can be fruitful in the generation of 
hypotheses and seeding of future expts. 

RE.CNT 32 THERE ARE 32 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 14 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:532664 CAPLUS 
DN 141:406692 

TI On bayesian modeling and design for microarray gene expression data 
AU Ji, Yuan

CS Univ. of Wisconsin, Madison, WI, USA 

SO (2003) 112 pp. Avail.: UMI, Order No. DA3101396 From: Diss. Abstr. Int., B 2004, 
64(8), 3889 
DT Dissertation 
LA English 
AB Unavailable 

L6 ANSWER 15 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:513881 CAPLUS 
DN 141:255049 

TI Probe design for large-scale molecular biology applications 

AU VanBuren, Vincent; Yoshikawa, Toshiyuki; Hamatani, Toshio; Ko, Minoru S. H. 

CS National Institute on Aging, Laboratory of Genetics, Developmental Genomics and 

Aging Section, National Institutes of Health, Baltimore, MD, 21224, USA 

SO Proceedings of the IEEE Bioinformatics Conference, 2nd, Stanford, CA, United 

States, Aug. 11-14, 2003 (2003), Meeting Date 2003, 502-503 Publisher: IEEE 

Computer Society, Los Alamitos, Calif. CODEN: 69FOVN; ISBN: 0-7695-2000-6 

DT Conference 

LA English 

AB Large-scale mol. biol. technologies such as DNA microarrays and large-scale in situ 
hybridization (ISH) are used to gain an appreciation of global attributes in biol. tissues 
and cells. Although many of these efforts use cDNA probes, an approach that makes 
use of designed oligo probes should offer improved consistency at uniform 
hybridization conditions and improved specificity, as demonstrated by various oligo 
microarray platforms. We describe a new Web-based application that takes FASTA- 
formatted sequences as input, and returns both a list of the best choices for probes 
and a full report contg. possible alternatives. Probe design for microarrays may use a 
scoring routine that optimizes probe intensity based upon an artificial neural network 



(ANN) trained to predict the av. probe intensity from the phys. properties of the probe 
and a screen for possible cross-reactivity. This new tool should provide a reliable way 
to construct probes that maximize signal intensity while minimizing cross-reactivity. 
The Web-based Probe Hunter application is available at 
http://probeworkshop.grc.nia.nih.gov. 

RE.CNT 1 THERE ARE 1 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 16 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:513878 CAPLUS 
DN 141:237255 

TI Gene selection for multi-class prediction of microarray data 

AU Chen, Dechang; Hua, Dong; Reifman, Jaques; Cheng, Xiuzhen 

CS Uniformed Services, University of the Health Sciences, Israel 

SO Proceedings of the IEEE Bioinformatics Conference, 2nd, Stanford, CA, United 

States, Aug. 11-14, 2003 (2003), Meeting Date 2003, 492-495 Publisher: IEEE 

Computer Society, Los Alamitos, Calif. CODEN: 69FOVN; ISBN: 0-7695-2000-6 

DT Conference 

LA English 

AB Gene expression data from microarrays have been successfully applied to class 
prediction, where the purpose is to classify and predict the diagnostic category of a 
sample by its gene expression profile. A typical microarray dataset consists of 
expression levels for a large no. of genes on a relatively small no. of samples. As a 
consequence, one basic and important question assocd. with class prediction is: how
do we identify a small subset of informative genes contributing the most to the
classification task? Many methods have been proposed but most focus on two-class
problems, such as discrimination between normal and disease samples. This paper
addresses selecting informative genes for multi-class prediction problems by jointly
considering all the classes simultaneously. Our approach is based on the power of the
genes in discriminating among the different classes (e.g., tumor types) and the existing
correlation between genes. We formulate the expression levels of a given gene by a
one-way anal. of variance model with heterogeneity of variances, and det. the
discriminatory power of the gene by a test statistic designed to test the equality of the 
class means. In other words, the discriminatory power of a gene is assocd. with a 
Behrens-Fisher problem. Informative genes are chosen such that each selected gene 
has a high discriminatory power and the correlation between any pair of selected genes 
is low. Test statistics considered in this paper include the ANOVA F test statistic, the 
Brown-Forsythe test statistic, the Cochran test statistic, and the Welch test statistic. 
Their performances are evaluated over several classification methods applied to two 
publicly available microarray datasets. The results show that Brown-Forsythe test 
statistic achieves the best performance. 
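
The abstract above names the Welch test statistic as one way to score a gene's discriminatory power when class variances differ, but gives no formula. The following is a minimal sketch of the textbook Welch one-way ANOVA statistic, not necessarily the exact variant used by the authors; the function name and the sample values are hypothetical, and the correlation filter between selected genes is not shown.

```python
from statistics import mean, variance

def welch_anova_statistic(groups):
    """Textbook Welch statistic for equality of class means under unequal variances.

    `groups` is a list of samples, one per class; each sample needs at least
    two values and a nonzero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Precision weights: class size divided by class sample variance.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    return between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))

# Hypothetical expression levels of one gene in three tumor classes.
gene = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [10.0, 11.0, 12.0]]
score = welch_anova_statistic(gene)
print(score)  # larger values indicate better separation of the class means
```

Under this sketch, genes would be ranked by `score`, and a gene whose class means coincide scores zero.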
AB A 'MageBuilder' object takes a set of 'MageMap' objects and a set of data streams
as input, and produces a MAGEstk object representation, which is then serialized as 
MAGE-ML. A 'MageMap' object encapsulates the rules of how data records from an 
input stream relate to one MAGE object. Each input "stream" is an anonymous 
subroutine that supplies records whose fields represent columns in the input table. The 
input tables can be delimited text files, database queries, or essentially any source that 
can be coerced into a set of records with fixed fields. 
AB Array comparative genomic hybridization (cgh) is a microarray technol. for
measuring the relative copy no. of thousands of genomic regions. Visual examn. of cgh
profiles shows that genomic changes occur on a variety of length scales. Such changes
may be characteristic of phenotypic variables such as tumor type and gene mutational
status. To aid in identifying such features and exploring their relationship with
phenotypic outcomes, we are applying wavelet transforms to the anal. of such profiles.
This allows us to decomp. a cgh signal into components on different length scales, 
even when the genome is severely aberrated, providing a convenient basis for 
exploring their behavior. Wavelet transforms may also be useful in the realm of gene 
expression. The expression signal given by genes in clustered order can be wavelet 
transformed, which compresses the signal from many genes into a few components, 
possibly aiding in the development of new tumor classifiers. 
AB We created a web-based microarray data anal. pipeline for managing the vols. of
data created by prodn. microarray expts. Expts. are formalized by grouping array data
into hierarchies based on types such as "dye swap" or "replicate." Grouping dets. the
anal. to be performed and enables the tool to automatically generate reports and
charts appropriate to the expt. results. Subsets of data across arrays may also be
hierarchically grouped into types such as "gene" or "list." The group hierarchy is
similar to a document object model (DOM), which enables queries to be posed in an
XPath or XQuery language. Analyzer modules provide the complicated statistical
processing and may be custom written or implemented as wrappers around existing
tools. For speculative data anal. or publication, the results may be exported to a std.
format. 

RE.CNT 2 THERE ARE 2 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 20 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2004:513814 CAPLUS 
DN 141:237619 

TI Fourier harmonic approach for visualizing temporal patterns of gene expression 
data 

AU Zhang, Li; Zhang, Aidong; Ramanathan, Murali 

CS Department of Computer Science and Engineering, State University of New York at 
Buffalo, Buffalo, NY, 14260, USA 

SO Proceedings of the IEEE Bioinformatics Conference, 2nd, Stanford, CA, United 
States, Aug. 11-14, 2003 (2003), Meeting Date 2003, 137-147 Publisher: IEEE 
Computer Society, Los Alamitos, Calif. CODEN: 69FOVN; ISBN: 0-7695-2000-6 
DT Conference 
LA English 

AB DNA microarray technol. provides a broad snapshot of the state of the cell by 
measuring the expression levels of thousands of genes simultaneously. Visualization 
techniques can enable the exploration and detection of patterns and relationships in a 
complex dataset by presenting the data in a graphical format in which the key 
characteristics become more apparent. The purpose of this study is to present an 
interactive visualization technique conveying the temporal patterns of gene expression 
data in a form intuitive for non-specialized end-users. The first Fourier harmonic 
projection (FFHP) was introduced to translate the multi-dimensional time series data 
into a two dimensional scatter plot. The spatial relationship of the points reflect the 
structure of the original dataset and relationships among clusters become two 
dimensional. The proposed method was tested using two published, array-derived 
gene expression datasets. These results demonstrate the effectiveness of the 
approach. 
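
The abstract above describes projecting each gene's time series onto its first Fourier harmonic to obtain a point in a two-dimensional scatter plot, without giving the formula. The sketch below assumes the projection is the first coefficient of the discrete Fourier transform, with its real and imaginary parts used as plot coordinates; the function name and the example profiles are hypothetical, and the published method may scale or normalize differently.

```python
import cmath
import math

def first_harmonic_point(series):
    """Map a time series to (real, imag) of its first DFT harmonic."""
    n = len(series)
    coeff = sum(x * cmath.exp(-2j * math.pi * t / n) for t, x in enumerate(series))
    return (coeff.real, coeff.imag)

n = 8
# Hypothetical expression profiles sampled at 8 time points.
early_peak = [math.cos(2 * math.pi * t / n) for t in range(n)]  # peaks at t = 0
late_peak = [math.sin(2 * math.pi * t / n) for t in range(n)]   # peaks a quarter cycle later
flat = [1.0] * n                                                 # no temporal pattern

points = {
    "early_peak": first_harmonic_point(early_peak),
    "late_peak": first_harmonic_point(late_peak),
    "flat": first_harmonic_point(flat),
}
```

In this sketch the angle of each point encodes when the profile peaks and the distance from the origin encodes how strongly it oscillates, so a flat profile lands at the origin.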
AB We propose a statistical method for estg. a gene network based on Bayesian
networks from microarray gene expression data together with biol. knowledge including
protein-protein interactions, protein-DNA interactions, binding site information, existing
literature and so on. Unfortunately, microarray data do not contain enough information 
for constructing gene networks accurately in many cases. Our method adds biol. 
knowledge to the estn. method of gene networks under a Bayesian statistical 
framework, and also controls the trade-off between microarray information and biol. 
knowledge automatically. We conduct Monte Carlo simulations to show the 
effectiveness of the proposed method. We analyze Saccharomyces cerevisiae gene 
expression data as an application. 
AB The oligo microarray (DNA chip) technol. in recent years has a significant impact
on genomic study. Many fields such as gene discovery, drug discovery, toxicol. 
research and disease diagnosis, will certainly benefit from its use. A microarray is an 
orderly arrangement of thousands of DNA fragments where each DNA fragment is a 
probe (or a fingerprint) of a gene/cDNA. It is important that each probe must uniquely 
assoc. with a particular gene/cDNA. Otherwise, the performance of the microarray will 
be affected. Existing algorithms usually select probes using the criteria of 
homogeneity, sensitivity, and specificity. Moreover, they improve efficiency employing 
some heuristics. Such approaches reduce the accuracy. Instead, the authors make 
use of some smart filtering techniques to avoid redundant computation while 
maintaining the accuracy. Based on the new algorithm, optimal short (20 bases) or 
long (50 or 70 bases) probes can be computed efficiently for large genomes. 
AB The design of large scale DNA microarrays is a challenging problem. So far, probe
selection algorithms must trade the ability to cope with large scale problems for a loss
of accuracy in the estn. of probe quality. The author presents an approach based on
jumps in matching statistics that combines the best of both worlds. This article
consists of two parts. The first part is theor. The author introduces the notion of jumps
in matching statistics between two strings and derives their properties. The author ests.
the frequency of jumps for random strings in a non-uniform Bernoulli model and
presents a new heuristic argument to find the center of the length distribution of the
longest substring that two random strings have in common. The results are 
generalized to near-perfect matches with a small no. of mismatches. In the second 
part, the author uses the concept of jumps to improve the accuracy of the longest 
common factor approach for probe selection by moving from a string-based to an 
energy-based specificity measure, while only slightly more than doubling the selection 
time. 
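As a rough illustration of the string-based specificity measure that the abstract above takes as its starting point, the longest common factor (longest common substring) between a candidate probe and a non-target sequence can be computed with a simple dynamic program. This is only a minimal sketch under that assumption, not the author's matching-statistics or energy-based algorithm; the function name and the example sequences are hypothetical.

```python
def longest_common_factor(probe: str, other: str) -> int:
    """Return the length of the longest substring shared by both strings."""
    best = 0
    # prev[j] = length of the longest common suffix of probe[:i-1] and other[:j]
    prev = [0] * (len(other) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(other) + 1)
        for j in range(1, len(other) + 1):
            if probe[i - 1] == other[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical candidate probe and non-target sequence.
    print(longest_common_factor("ACGTACGT", "TTACGTAA"))
```

Under this measure, a candidate whose longest common factor with every non-target sequence is short would be regarded as more specific. The quadratic running time of this sketch is exactly what makes faster approaches attractive at genome scale.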
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AB The user-friendly MicroPreP framework was developed to transform raw intensity 
data from cDNA microarrays into high-quality data. The main features of this software 
are: LOWESS normalization; merging of DNA microarray data from changing slide 
versions; outlier detection; and slide quality assessment. The software is available at 
http://molgen.biol.rug.nl/molgen/research/molgensoftware.php.
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AB A hybrid GA (genetic algorithm)-based clustering (HGACLUS) schema, combining 
the merits of simulated annealing, was described for finding an optimal or near-optimal
set of methods. This schema maximized the clustering success by achieving internal
cluster cohesion and external cluster isolation. The performance of HGACLUS and
other methods was compared by using ***simulated*** data and open 
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***microarray*** gene-expression datasets. HGACLUS was generally found to be 
more accurate and robust than other methods discussed in this paper by the exact 
validation strategy and the explicit cluster no. 
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AB Gene expression anal. using high-throughput microarray technol. has become a
powerful approach to study systems biol. The exponential growth in microarray expts.
has spawned a no. of investigations into the reliability and reproducibility of this type of
data. However, the sample size requirements necessary to obtain statistically
significant results have not received as much attention. Statistical methods for the detn.
of the no. of subjects sufficient to minimize the false discovery rate while
maintaining high power to detect differentially expressed genes are reported here.
Two exptl. designs were considered: a comparison between two groups at a single time 
point, and a comparison of two exptl. groups with sequential time points. Computer 
programs are available for the methods discussed in this paper and are adaptable to 
more complicated situations. 
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AB A review aimed at introducing the reader to the basic concepts underlying the 
statistical and data mining methods used for the anal. of microarray data. An attempt
is made to provide an introductory review and a basic guide on microarray data anal.
strategies, complex math. equations and their computational implementations. This
review does not include early comparative gene expression anal. using microarrays
where gene expression was established in terms of fold change. An inherent problem
with this criterion is that genes with low abs. expression levels have a greater inherent
error in their measurements and are more likely than higher expressing genes to meet
any fold change cut-off. Different concepts related to the microarray data anal.
process including microarray gene expression matrix, outliers, missing values, distance
functions, unsupervised and supervised methods, advantages, limitations and
considerations to est. their reliability are presented. When it is necessary, biol.
examples are provided with the aim to highlight the relevance of some microarray data
anal. methods. This review also introduces information about software (public and
private) that can help the reader to choose suitable tools for the anal. of their particular
microarray gene expression data. Some aspects about the implementation of 
microarray data repositories, the development of stds. including the Min. Information 
About Microarray Expts. (MIAME) and its computational implementation through MAGE- 
OM, MAGE-ML and MAGE-stk are highlighted. Finally, current trends and future 
challenges that microarray technol. will present are also summarized. 
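The fold-change problem described in the review above can be made concrete with a small simulation. The sketch below assumes a purely additive Gaussian measurement error of fixed size (an assumption made here for illustration, not a model taken from the review) and counts how often a gene with no true change still meets a two-fold cut-off. All names and parameter values are hypothetical.

```python
import math
import random


def false_call_rate(level: float, noise_sd: float, cutoff: float = 2.0,
                    trials: int = 2000, seed: int = 0) -> float:
    """Fraction of unchanged genes whose measured fold change meets the cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Same true level in both conditions; only the additive noise differs.
        # Values are floored at 1.0 so that the ratio stays well defined.
        a = max(1.0, level + rng.gauss(0.0, noise_sd))
        b = max(1.0, level + rng.gauss(0.0, noise_sd))
        if abs(math.log2(a / b)) >= math.log2(cutoff):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("low expresser :", false_call_rate(level=50.0, noise_sd=20.0))
    print("high expresser:", false_call_rate(level=5000.0, noise_sd=20.0))
```

With the same absolute noise, the low-expressing gene crosses the cut-off far more often than the high-expressing one, which is the behavior the review warns about.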
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AB A review and discussion of protocols for the prodn. of DNA and protein 
microarrays with the consistency and accuracy needed for FDA approval as diagnostic 
and drug testing devices. The regulation of robotics, surface chem., sample prepn. and 
environment in the prodn. of microarrays is discussed. 
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AB A microarray-based detection system was developed, which employs 16S rDNA 
sequence as its detection target for rapid and efficient detection of rickettsiae. Specific 
probes targeting 6 species of rickettsiae were designed by using bioinformatics
software and methods. Ten strains of rickettsiae were tested by using this microarray
system. The results showed that Bartonella henselae, Orientia tsutsugamushi,
Rickettsia rickettsii, R. prowazekii, and Coxiella burnetii could be detected at species or
genus level. For example, R. rickettsii had a cross reaction with 3 out of 4 probes
specifically targeting R. prowazekii, while R. prowazekii hybridized with only 2
R. rickettsii probes. Ehrlichia canis was neg. throughout the whole expt. and the
reason was under evaluation. The sensitivity assay was performed by employing serial
diln. of C. burnetii chromosomal DNA. The sensitivity of the detection system was
found to be 10 times higher than that of PCR-electrophoresis. The oligonucleotide
microarray could det. most of the test strains at species level. The overall time for
sample processing, hybridization and data acquisition was about 4.5 h. The
oligonucleotide microarray can be used for the detection of rickettsiae. 
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AB A review. High-throughput inferential methods such as domain fusion, 
chromosomal proximity, and phylogenetic profiling and exptl. technologies including 
high-d. arrays of probes for detecting expressed genes are discussed. The application 
of microarrays to classification of cancers and cancer pathogenesis as well as to host 
response to pathogens is also discussed.
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AB A review discusses the utility of oligo-probe microarrays for expression profiling in 
comparison with cDNA-probe microarrays. It describes the computational 
considerations for oligo-probe attachment, the chem. requirements of oligo-probe 
attachment, labeling methodologies, and the use of oligo-probe arrays for single 
nucleotide polymorphism detection. 
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AB Obtaining abundant information on the expression of many genes occurring 
simultaneously in cells by the DNA array anal. technol. is very important in life
science research. However, the data processing method affects the result greatly. In
this paper, a program named EX-ARRAY was prepd. and examd. for verifying macro-array
data obtained with 33P-labeled probes. The original data were
obtained from the software Array Gauge and were processed by setting two different
backgrounds. The two resulting data sets were exported as text files and were input to
EX-ARRAY. Processing through this program enhanced the reliability of data anal.
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AB A variety of new procedures have been devised to handle the two-sample 
comparison (e.g., tumor vs. normal tissue) of gene expression values as measured with 
microarrays. Such new methods are required in part because of some defining 
characteristics of microarray-based studies: (i) the very large no. of genes contributing 
expression measures which far exceeds the no. of samples (observations) available and 
(ii) the fact that by virtue of pathway/network relationships, the gene expression 
measures tend to be highly correlated. These concerns are exacerbated in the 
regression setting, where the objective is to relate gene expression, simultaneously for 
multiple genes, to some external outcome or phenotype. Correspondingly, several 
methods have been recently proposed for addressing these issues. We briefly critique 
some of these methods prior to a detailed evaluation of gene harvesting. This reveals 
that gene harvesting, without addnl. constraints, can yield artifactual solns. Results 
obtained employing such constraints motivate the use of regularized regression 
procedures such as the lasso, least angle regression, and support vector machines. 
Model selection and soln. multiplicity issues are also discussed. The methods are 
evaluated using a microarray-based study of cardiomyopathy in transgenic mice. 
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AB A review, with refs. In the past several years many linear models have been 
proposed for analyzing two-color microarray data. As presented in the literature, many 
of these models appear dramatically different. However, many of these models are 
reformulations of the same basic approach to analyzing microarray data. This paper 
demonstrates the equivalence of some of these models. Attention is directed at 
choices in microarray data anal. that have a larger impact on the results than the
choice of linear model. 
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AB DNA chips are used to study the compn. of genetic material. We report the results 
of an exptl. study of the synthesis of DNA microarrays using a maskless
photodeprotection process. In these "chips," the quality of the final product is
dependent on the type and frequency of errors in the synthesis of the oligonucleotides. 
Contrary to photoresist, the photochem. is linear and thus more prone to the 
introduction of defects. To understand and characterize the exposure process, we have 
developed a theor. image formation model based on std. lithog. modeling tools. Exptl., 
we have used a microarray synthesizer similar to that described in (Ref. 1), but using 
an argon ion laser as radiation source. To characterize the process, we have acquired 
aerial images using a CCD camera, a photosensitive film, and fluorescence image of a 
T-base monomer. We will discuss the imaging properties of the optical system, the 
models used to analyze the data and the relation between measured images and DNA 
stepwise synthesis yield. 
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Wang, Xujing; Hessner, Martin J. 

CS Max McGee National Research Center for Juvenile Diabetes, Department of 
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SO Annals of the New York Academy of Sciences (2003), 1005(Immunology of
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PB New York Academy of Sciences 
DT Journal 
LA English 

AB We have created an immunol.-related microarray chip contg. primarily known 
genes with well-studied functional properties. By looking at known genes rather than 
expressed sequence tags, we hope to gain a better understanding of immunol. 
pathways and how they work. The immunol. gene chip contains genes from the 
following functional categories: T cell genes; B cell genes; dendritic cell genes; 
chemokine and cytokine genes; apoptosis genes; cell cycle genes; cell interaction 
genes; general hematol. and immunol. genes; and adhesion genes. We have also 
developed a novel three-color cDNA array platform in which arrays are directly 
visualized before hybridization, which allows us to select only high-quality chips for our 
expts. In an effort to provide quant. quality control for each array element as well as
the entire chip, we have developed Matarray, a software package for image processing
and data acquisition. With Matarray, we have built a quant. data filtering and
normalization scheme that has proved to be more efficient than the existing methods. 
The list of immunol. chip genes is available from the authors. 
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AB Cross hybridization on microarrays generates signals from unintended genes, 
which presents a special challenge in gene expression profiling studies since it directly 
leads to false positives. However, little is known about the extent of cross hybridization 
and why certain probes are particularly prone to cross hybridization. Recently, we 
developed a free-energy model of binding interactions on oligonucleotide arrays that 
can decomp. the obsd. probe signals in terms of the effects of gene-specific and 
generic non-specific binding. We analyzed the data set provided by Affymetrix Inc., 
which followed a Latin square design with 14 genes spiked-in at various concns. 
Around 31 probesets show reproducible response to the spiked-in genes. In most
cases, we were able to ext. the amt. of cross hybridization signal and identify the 
source, i.e., the fragments of spiked-in genes that match the cross hybridizing probes. 
These findings demonstrate the utility of our model for identifying spurious
cross-hybridization signals and obtaining robust measures of gene expression levels.
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AB Biol. and tech. variances were estd. from the Project Normal data using the mixed
model anal. of variance. The tech. variance is larger than the biol. variance in most
genes. In expts. for detecting treatment effects using a ref. design, increasing the no. 
of mice per treatment is more effective than pooling mice or increasing the no. of 
arrays per mouse. For a given no. of arrays, more mice per treatment with fewer arrays 
per mouse are more powerful than fewer mice per treatment with more arrays per 
mouse. A formula is provided for computing the optimum no. of arrays per mouse to 
minimize the total cost of the expt. 
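The claim above that, for a fixed number of arrays, more mice with fewer arrays per mouse is the more powerful layout can be illustrated with a simplified nested model. The sketch assumes that the variance of a treatment mean is biol_var / mice + tech_var / (mice * arrays_per_mouse); this is a textbook two-level approximation chosen here for illustration, not the formula or the reference-design model of the abstract, and the variance components are hypothetical.

```python
def treatment_mean_variance(mice: int, arrays_per_mouse: int,
                            biol_var: float, tech_var: float) -> float:
    """Variance of a treatment mean under a simple nested (mouse, array) model."""
    return biol_var / mice + tech_var / (mice * arrays_per_mouse)


if __name__ == "__main__":
    # Hypothetical variance components; every layout uses 8 arrays per treatment.
    for mice, arrays in [(2, 4), (4, 2), (8, 1)]:
        v = treatment_mean_variance(mice, arrays, biol_var=1.0, tech_var=4.0)
        print(f"{mice} mice x {arrays} arrays per mouse: variance = {v:.3f}")
```

Because the technical term depends only on the total number of arrays, spreading the same arrays over more mice shrinks the biological term and leaves the technical term unchanged, so the variance falls even when the technical variance is the larger component.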
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AB We consider two linear mixed models for the normal mouse data [Pritchard et al., 
2001]. One models the log2 intensity measurements directly and the other models the 
log2 ratios. In each approach, we treat a mouse as a fixed effect, and alternatively, we 
also model it as a random effect to assess its variability directly. We compare the 
results from these mixed model approaches. The models agree that array variance is 
much larger than other sources of variability, but differ somewhat in their lists of genes 
exhibiting the most significant mouse effects. Under a Bonferroni criterion, the ratio- 
based model we consider produces more genes with significant mouse effects than the 
intensity-based model, but fewer genes with significant tissue effects. Both models 
demonstrate a general statistical framework for concurrently estg. sources of variability 
and assessing their significance. 
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AB In a study done by Pritchard et al. [2001], normal gene expression variation was 
examd. in six genetically identical male mice, to det. a baseline variation for gene 
expression studies in mice. In this paper, we use data from their study to accomplish 
the following three goals: 1) Evaluate five data normalization procedures along with
two methods omitting data normalization, and study their impact on identifying
baseline differentially expressed genes; 2) Perform pair-wise comparisons using
McNemar's tests on five normalization methods and two methods omitting the
normalization step; 3) Address data quality issues and examine the effect of
normalization on anal. results for genes that do not meet either or both of the two data
quality criteria. Depending on which normalization method is used, and whether the
normalization step is omitted, the no. of genes and the set of genes identified as
differentially expressed from the same study can be substantially different. Anal.
demonstrates that when data quality is not ensured, performing normalization can add
noise to the data and can bias gene-based ANOVA results. Thus we conclude that
ensuring data quality and establishing quality control measures are crucial to increase
the effectiveness of normalization procedures and the accuracy of data anal. results.
The study also reconfirmed that proper exptl. design and establishing rigorous data 
quality control stds. are indispensable factors for the success of a microarray expt. 
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AB We developed methods for characterizing a set of ***microarray*** images 
and for subsequently ***simulating*** ***microarray*** images with statistical 
properties similar to those in the original set. Characterization involved measuring 
properties of individual spots and performing anal. of variance to det. the relative
contributions of individual pins used for printing and individual slides to the variation
obsd. in spot phys. properties, slide background properties, and intensity of individual
genes. Slide backgrounds and individual spot nonuniformities were modeled as 2D
causal Markov random fields, and parameters for these were derived from the set of
real images. The results of the characterization were then used to generate realistic
replicates of the original dataset that can be used for evaluating microarray data
processing and anal. techniques. We demonstrated the process on a set of microarray
images derived from a mouse kidney expt. The characterization of these images 
showed that slides from two of the six mice have significantly different spot properties 
from the rest. Simulated images from the set are shown to realistically model most 
properties of the slides, save for large handling defects. We concluded that 
characterization should be an important part of any ***microarray*** expt. to 
maintain quality control, and that realistic ***simulations*** of ***microarray*** 
images can be produced using these methods. 
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AB The present invention is directed to systems and methods for assessing the 
success of the transplant of a cell, tissue, or organ before and after transplant. Protein 
array technol. is used to obtain a biomarker pattern for the cell, tissue, or organ that is 
being considered for transplant or that has been transplanted. Samples for the 
identification of biomarkers and biomarker patterns are obtained from the cell, tissue or 
organ itself, or from a body fluid of the donor or recipient. Sample biomarker data are 
compared to ref. biomarker data obtained from donors, recipients or cells, tissues or 
organs that have been transplanted. Correlation of a sample biomarker pattern with 
the ref. biomarker pattern, where transplant outcome for the samples used for the ref. 
biomarkers is known, permits a suggested treatment detn. A computerized system to 
identify the condition of transplant before or after implantation is also provided. 
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AB Human group A rotavirus (HRV) is the major cause of severe gastroenteritis in 
infants worldwide. HRV shares the feature of a high degree of genetic diversity with 
many other RNA viruses, and therefore, genotyping of this organism is more 
complicated than genotyping of more stable DNA viruses. We describe a novel
microarray-based method that allows high-throughput genotyping of RNA viruses with 
a high degree of polymorphism by multiplex capture and type-specific extension on 
microarrays. Denatured reverse transcription (RT)-PCR products derived from two 
outer capsid genes of clin. isolates of HRV were hybridized to immobilized capture
oligonucleotides representing the most commonly occurring P and G genotypes on a 
microarray. Specific primer extension of the type-specific capture oligonucleotides was 
applied to incorporate the fluorescent nucleotide analog cyanine 5-labeled dUTP as a 
detectable label. Laser scanning and fluorescence detection of the 
***microarrays*** was followed by visual or ***computer*** -assisted 
interpretation of the fluorescence patterns generated on the ***microarrays***.
Initially, the method detected HRV in all 40 samples and correctly detd. both the G and 
the P genotypes of 35 of the 40 strains analyzed. After modification by inclusion of 
addnl. capture oligonucleotides specific for the initially unassigned genotypes, all 
genotypes could be correctly defined. The results of genotyping with the microarray 
fully agreed with the results obtained by nucleotide sequence anal. and
sequence-specific multiplex RT-PCR. Owing to its robustness, simplicity, and general utility, the
microarray-based method may gain wide applicability for the genotyping of 
microorganisms, including highly variable RNA and DNA viruses. 
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AB One objective of systems biol. is to create predictive, quant. models of the
transcriptional regulation networks that govern numerous cellular processes. Gene
expression measurements, as provided by microarrays, are commonly used in studies
that attempt to infer the regulation underlying these processes. At present, most gene
expression models that have been derived from microarray data are based in discrete
time, which limits their applicability to common biol. data sets and may impede the
integration of gene expression models with other models of biol. processes that are 
formulated as ordinary differential equations (ODEs). To overcome these difficulties, a 
continuous-time approach for process identification to identify gene expression models 
based in ODEs was developed. The approach utilizes the modulating functions method 
of parameter identification. The method was applied to three simulated systems: (1) a 
linear gene expression model, (2) an autoregulatory gene expression model, and (3) 
***simulated*** ***microarray*** data from a nonlinear transcriptional network. 
In general, the approach was well suited for identifying models of gene expression 
dynamics, capable of accurately identifying parameters for small nos. of data samples 
in the presence of modest exptl. noise. Addnl., numerous insights about gene
expression modeling were revealed by the case studies. 
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AB Oligonucleotide microarrays have demonstrated potential for the anal. of gene
expression, genotyping, and mutational anal. Our work focuses primarily on the 
detection and identification of bacteria based on known short sequences of DNA. Oligo 
Design, the software described here, automates several design aspects that enable the 
improved selection of oligonucleotides for use with microarrays for these applications. 
Two major features of the program are: first, a tiling algorithm for the design of short 
overlapping temp.-matched oligonucleotides of variable length, which are useful for the 
anal. of single nucleotide polymorphisms. Second, a set of tools for the anal. of multiple
alignments of gene families and related short DNA sequences, which allow for the 
identification of conserved DNA sequences for PCR primer selection and variable DNA 
sequences for the selection of unique probes for identification. Note that the program 
does not address the full genome perspective but, instead, is focused on the genetic 
anal. of short segments of DNA. The program is Internet-enabled and includes a
built-in browser and the automated ability to download sequences from GenBank by
specifying the GI no. The program also includes several utilities, including audio recital 
of a DNA sequence (useful for verifying sequences against a written document), a 
random sequence generator that provides insight into the relationship between melting 
temp. and GC content, and a PCR calculator.
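The relationship between melting temp. and GC content mentioned in the abstract above can be explored with a few lines of code. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in deg C, a common rough estimate for short oligonucleotides; whether Oligo Design itself uses this rule is not stated in the abstract, so it is an assumption here, and all function names are hypothetical.

```python
import random


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temp. (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def random_sequence(length: int, rng: random.Random) -> str:
    """Draw a uniformly random DNA sequence of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(1)
    for _ in range(5):
        s = random_sequence(20, rng)
        print(s, f"GC={gc_content(s):.2f}", f"Tm~{wallace_tm(s)} C")
```

For a fixed length, this estimate rises linearly with the number of G and C bases, which is the kind of trend a random sequence generator makes visible.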
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AB Cancer diagnosis using gene expression profiles requires supervised learning and 
gene selection methods. Of the many suggested approaches, the method of emerging 
patterns (EPs) has the particular advantage of explicitly modeling interactions among 
genes, which improves classification accuracy. However, finding useful (i.e. short and
statistically significant) EPs is typically very hard. Here we introduce a CART-based
approach to discover EPs in microarray data. The method is based on growing decision 
trees from which the EPs are extd. This approach combines pattern search with a 
statistical procedure based on Fisher's exact test to assess the significance of each EP. 
Subsequently, sample classification based on the inferred EPs is performed using max.- 
likelihood linear discriminant anal. Using simulated data as well as gene expression 
data from colon and leukemia cancer expts., we assessed the performance of our
pattern search algorithm and classification procedure. In the simulations, our method
recovers a large proportion of known EPs while for real data it is comparable in
classification accuracy with three top-performing alternative classification algorithms.
In addn., it assigns statistical significance to the inferred EPs and allows the patterns
to be ranked while simultaneously avoiding overfitting of the data. The new approach
therefore provides a versatile and computationally fast tool for elucidating local gene 
interactions as well as for classification. 
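The significance step named in the abstract above, Fisher's exact test, can be sketched for a single candidate pattern as follows. The 2x2 layout (class by pattern present/absent) and the one-sided alternative are assumptions made for this illustration; the abstract does not specify how the authors set up the test, and the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: samples of class 1 / class 2; columns: pattern present / absent.
    Returns P(count in cell a >= observed) with all margins held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    p = 0.0
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(row2, col1 - k) / total
    return p


if __name__ == "__main__":
    # Hypothetical pattern seen in 9 of 10 tumor samples and 1 of 10 normal samples.
    print(fisher_one_sided(9, 1, 1, 9))
```

A small p-value indicates that the pattern is concentrated in one class more strongly than the fixed margins alone would explain, which is one way to rank candidate patterns.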
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AB This paper presents a cluster validation tool for gene expression data. The Machaon
CVE (Clustering and Validation Environment) system aims to partition samples or genes
into groups characterized by similar expression patterns, and to evaluate the quality of 
the clusters obtained. 
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AB Summary: New techniques in sample prepn. allow high throughput anal. of
samples on the transcriptional as well as on the metabolic level. We present a service
accessible via the web that allows the anal. of integrated data sets that combine
gene-expression data and metabolic data. After uploading, data sets can be normalized,
clustered by various methods and results can be graphically visualized. All calcns. are 
carried out on a server, so even time- and memory-consuming analyses can be done 
independently of the performance of the client. 
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AB Bayesian networks have been applied to infer genetic regulatory interactions from 
microarray gene expression data. This inference problem is particularly hard in that 
interactions between hundreds of genes have to be learned from very small data sets, 
typically contg. only a few dozen time points during a cell cycle. Most previous studies 
have assessed the inference results on real gene expression data by comparing 
predicted genetic regulatory interactions with those known from the biol. literature. 
This approach is controversial due to the absence of known gold stds., which renders 
the estn. of the sensitivity and specificity, i.e., the true and (complementary) false 
detection rate, unreliable and difficult. The objective of the present study is to test the 
viability of the Bayesian network paradigm in a realistic simulation study. First, gene 
expression data are simulated from a realistic biol. network involving DNAs, mRNAs, 
inactive protein monomers and active protein dimers. Then, interaction networks are 
inferred from these data in a reverse engineering approach, using Bayesian networks 
and Bayesian learning with Markov chain Monte Carlo. Results: The simulation results 
are presented as receiver operating characteristic curves. This allows estg. the
proportion of spurious gene interactions incurred for a specified target proportion of 
recovered true interactions. The findings demonstrate how the network inference 
performance varies with the training set size, the degree of inadequacy of prior 
assumptions, the exptl. sampling strategy and the inclusion of further, sequence-based 
information. 
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AB Exptl. limitations have resulted in the popularity of parametric statistical tests as a
method for identifying differentially regulated genes in microarray data sets. However, 
these tests assume that the data follow a normal distribution. To date, the assumption 
that replicate expression values for any gene are normally distributed, has not been 
critically addressed for Affymetrix GeneChip data. The normality of the expression 
values calcd. using four different com. and academic software packages was 
investigated using a data set consisting of the same target RNA applied to 59 human 
Affymetrix U95A GeneChips using a combination of statistical tests and visualization 
techniques. For the majority of probe sets obtained from each anal. suite, the
expression data showed a good correlation with normality. The exception was a large 
no. of low-expressed genes in the data set produced using Affymetrix Microarray Suite 
5.0, which showed a striking non-normal distribution. In summary, our data provide 
strong support for the application of parametric tests to GeneChip data sets without the 
need for data transformation. 
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AB This paper introduces a MATLAB toolbox, MGraph, which applies graphical models 
as a natural environment to formulate and solve problems in microarray data anal. 
MGraph with its graphical interface allows the user to predict genetic regulatory 
networks by a graphical gaussian model (GGM), and to quantify the effects of different 
exptl. treatment conditions on gene expression profiles by a graphical log-linear model 
(GLM). The power of graphical models was explored and illustrated through two 
example applications. First, four MAPK pathways in yeast were meaningfully 
reconstructed through GGM. Second, GLM was used to quantify the contributions of 
sex, genotype and age to transcriptional variance in Drosophila melanogaster. This 
application may provide a valuable aid in the prediction of genetic regulatory networks, 
as well as in investigations of various exptl. conditions that affect global gene 
expression profiles. Availability: The MATLAB program MGraph is freely available for
academics at http://www.uio.no/~junbaiw/mgraph/mgraph.html.
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AB Bioinformatics anal. plays an integrative role in genomics and functional genomics.
The ability to conduct quality managed, hypothesis-driven bioinformatics anal. with the
plethora of data available is mandatory. Biol. interpretation of this data is dependent on
versions of databases, programs and the parameters used. Thus, tracking and auditing
the anal. process is important. This paper outlines what we term Bioinformatics
Anal. Audit Trails (BAATs) and describes YABI, a bioinformatics environment that 
implements BAATs. YABI can incorporate most bioinformatics tools within the same 
environment, making it a valuable resource. 
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AB The development of microarray technol. has allowed researchers to measure 
expression levels of thousands of genes simultaneously. Anal. of these data requires
the best normalization and statistical approaches to account for the biol. and tech. 
variability inherent in the technique. To approach this problem we have developed a 
publicly available ***simulator*** of ***microarray*** hybridization expts. that
can be used to help assess the accuracy of bioinformatic tools in discovering significant
genes. After analyzing microarray hybridization expts. from over 50 samples, an est. of 
various degrees of tech. and biol. variability was obtained. This information was used 
to develop a ***simulator*** of ***microarray*** hybridization data which 
modeled "normal-tissue samples" and "diseased tissue samples" with known, defined, 
changes in gene expression (a "gold std."). The data derived from the simulator were 
then used to evaluate the true pos. and false neg. rates of several normalization 
procedures and gene selection techniques. We found that the type of normalization 
approach used was an important aspect of data anal. Global normalization was the 
least accurate approach. Evaluation of gene selection techniques showed that 
"Significance anal. of microarrays" (SAM) and "Patterns of Gene Expression" (PaGE) 
were more accurate than simple t-test anal. We provide access to the 
***microarray*** hybridization ***simulator*** as a public resource for biologists 
to further test new emerging genomic bioinformatic tools. 
TI Global analysis of ligand sensitivity of estrogen inducible and suppressible genes in 
MCF7/BUS breast cancer cells by DNA microarray 

AU Coser, Kathryn R.; Chesnes, Jessica; Hur, Jingyung; Ray, Sandip; Isselbacher, Kurt 
CS Department of Tumor Biology and DNA Microarray Core Facility, Massachusetts 
General Hospital Cancer Center, Charlestown, MA, 02129, USA 
DT Journal 
AB To obtain comprehensive information on 17.beta.-estradiol (E2) sensitivity of 
genes that are inducible or suppressible by this hormone, we designed a method that 
dets. ligand sensitivities of large nos. of genes by using DNA ***microarray*** and 
a set of simple Perl ***computer*** scripts implementing the std. metric statistics. 
We used it to characterize effects of low (0-100 pM) concns. of E2 on the transcription 
profile of MCF7/BUS human breast cancer cells, whose E2 dose-dependent growth 
curve satd. with 100 pM E2. Evaluation of changes in mRNA expression for all genes 
covered by the DNA microarray indicated that, at a very low concn. (10 pM), E2 
suppressed .apprxeq.3-5 times larger nos. of genes than it induced, whereas at higher 
concns. (30-100 pM) it induced .apprxeq. 1.5-2 times more genes than it suppressed. 
Using clearly defined statistical criteria, E2-inducible genes were categorized into 
several classes based on their E2 sensitivities. This approach of hormone sensitivity 
anal. revealed that expression of two previously reported E2-inducible autocrine growth 
factors, transforming growth factor .alpha. and stromal cell-derived factor 1, was not 
affected by 100 pM and lower concns. of E2 but strongly enhanced by 10 nM E2, which 
was far higher than the concn. that satd. the E2 dose-dependent growth curve of 
MCF7/BUS cells. These observations suggested that biol. actions of E2 are derived 
from expression of multiple genes whose E2 sensitivities differ significantly and, hence, 
depend on the E2 concn., esp. when it is lower than the satg. level, emphasizing the 
importance of characterizing the ligand dose-dependent aspects of E2 actions. 
PRAI JP 2002-154773 A 20020528 WO 2003-JP6677 W 20030528 
AB A method, app., and computer programs are disclosed for analyzing and 
comparing data obtained from DNA microarray expts. Hybridization, normalization, and 
various statistical anal. steps such as Euclidean distance, regression correction, are 
used. Use of multiple control genes as spike-in RNAs, specifically, Renilla luciferase 
gene, baculovirus gp64 gene, and .lambda. phage ea22 gene, and housekeeping 
genes, is claimed. 

RE.CNT 6 THERE ARE 6 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 60 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:940843 CAPLUS 
DN 140:175838 

TI EMMA: a platform for consistent storage and efficient analysis of microarray data 
AU Dondrup, Michael; Goesmann, Alexander; Bartels, Daniela; Kalinowski, Jorn; 
Krause, Lutz; Linke, Burkhard; Rupp, Oliver; Sczyrba, Alexander; Puhler, Alfred; Meyer, 
Folker 

CS Center for Genome Research, Bielefeld University, Bielefeld, D-33594, Germany 
SO Journal of Biotechnology (2003), 106(2-3), 135-146 CODEN: JBJTD4; ISSN: 0168-1656 

PB Elsevier Science B.V. 
DT Journal 
LA English 

AB As a high throughput technique, microarray expts. produce large data sets, 
consisting of measured data, lab. protocols, and exptl. settings. We have implemented 
the open source platform EMMA to store and analyze these data. The system provides 
automated pipelines for data processing and has a modular architecture that can be 
easily extended. EMMA features detailed reports about spots and their corresponding 
measurements. In addn. to routine data anal. algorithms, the system can be 
integrated with other components that contain addnl. data sources (e.g. genome 
annotation systems). 
AB The variability of results in microarray technol. is in part due to the fact that 
independent scans of a single hybridized microarray give spot images that are not quite 
the same. To solve this problem and turn it to our advantage, we introduced the 
approach of multiple scanning and of image integration of microarrays. To this end, 
we have developed specific software that creates a virtual image that statistically 
summarizes a series of consecutive scans of a microarray. We provide evidence that 
the use of multiple imaging (i) enhances the detection of differentially expressed 
genes; (ii) increases the image homogeneity; and (iii) reveals false-pos. results such as 
differentially expressed genes that are detected by a single scan but not confirmed by 
successive scanning replicates. The increase in the final no. of differentially expressed 
genes detected in a microarray expt. with this approach is remarkable; 50% more for 
microarrays hybridized with targets labeled by reverse transcriptase, and 200% more 
for microarrays developed with the tyramide signal amplification (TSA) technique. The 
results have been confirmed by semi-quant. RT-PCR tests. 
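
The abstract above describes a virtual image that statistically summarizes a series of consecutive scans, but it does not name the summary statistic. The following is a minimal Python sketch that assumes a pixel-wise median; the function name virtual_image and the toy scan values are hypothetical and are not taken from the cited work.

```python
from statistics import median

def virtual_image(scans):
    """Combine equally sized scans (lists of pixel rows) into one image.

    Assumption: the summary statistic is the pixel-wise median, which
    damps values that appear in only one scan.
    """
    rows, cols = len(scans[0]), len(scans[0][0])
    return [
        [median(scan[r][c] for scan in scans) for c in range(cols)]
        for r in range(rows)
    ]

# Three toy 2x2 scans of the same array; the 900 in the third scan is a
# single-scan outlier of the kind repeated scanning is meant to expose.
scans = [
    [[10, 200], [35, 0]],
    [[12, 190], [30, 0]],
    [[11, 900], [31, 2]],
]
combined = virtual_image(scans)
```

With these toy values the outlier does not survive into the combined image, which is the behavior the abstract attributes to successive scanning replicates.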
AB A review. Dynamic Bayesian networks (DBNs) are considered as a promising 
model for inferring gene networks from time series microarray data. DBNs have 
overtaken Bayesian networks (BNs) as DBNs can construct cyclic regulations using time 
delay information. In this paper, a general framework for DBN modeling is outlined. 
Both discrete and continuous DBN models are constructed systematically and criteria 
for learning network structures are introduced from a Bayesian statistical viewpoint. 
This paper reviews the applications of DBNs over the past years. Real data applications 
for Saccharomyces cerevisiae time series gene expression data are also shown. 
AB In microarray data there are a no. of biol. samples, each assessed for the level of 
gene expression for a typically large no. of genes. There is a need to examine these 
data with statistical techniques to help discern possible patterns in the data. The 
technique applies a combination of math. and statistical methods to progressively take 
the data set apart so that different aspects can be examd. for both general patterns 
and very specific effects. Unfortunately, these data tables are often corrupted with 
extreme values (outliers), missing values, and non-normal distributions that preclude 
std. anal. The authors develop a robust anal. method to address these problems. The 
benefits of this robust anal. will be both the understanding of large-scale shifts in gene 
effects and the isolation of particular sample-by-gene effects that might be either 
unusual interactions or the result of exptl. flaws. The method requires a single pass 
and does not resort to complex "cleaning" or imputation of the data table before anal. 
The authors illustrate the method with a com. data set. 
AB The authors have previously described a microarray of cluster of differentiation 
(CD) antibodies that enables concurrent detn. of more than 60 CD antigens on 
leukocytes. This procedure does not require protein purifn. or labeling, or a secondary 
detection system. Whole cells are captured by a microarray of 10 nL antibody dots 
immobilized on a nitrocellulose film on a microscope slide. Distinct patterns of cell 
binding are obsd. for different leukemias or lymphomas. These hematol. malignancies 
arise from precursor cells of T- or B-lymphocytic, or myeloid lineages of hematopoiesis. 
The dot patterns obtained from patients are distinct from those of peripheral blood 
leukocytes from normal subjects. This microarray technol. has recently undergone a no. 
of refinements. The microarray now contains more CD antibodies, and a scanner for 
imaging dot patterns and software for data anal. provide an extensive 
immunophenotype sufficient for diagnosis of common leukemias. The technol. is being 
evaluated for diagnosis of leukemias with parallel use of conventional diagnostic 
criteria. 
AB Models of different patterns of genetic interactions were formulated and used in 
methods to det. the architecture of genetic interactions from mRNA expression levels 
measured in microarray expts. The methods can be used to screen biol. systems to 
identify which systems are candidates for therapeutic intervention. Also provided are 
machine readable storage and systems for using the disclosed models and methods. 

L6 ANSWER 66 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:912713 CAPLUS 
DN 139:361250 

TI Method and system for normalization of micro array data based on local 
normalization of rank-ordered, globally normalized data 

IN Wolber, Paul K.; Shannon, Karen W.; Fulmer-Smentek, Stephanie B.; Troup, 
Charles D.; Amorese, Douglas A.; Sampas, Nicholas M.; Ghosh, Srinka; Connell, Scott 
D. 

PA USA 

SO U.S. Pat. Appl. Publ., 38 pp. CODEN: USXXCO 
DT Patent 
LA English 

FAN.CNT 1 
     PATENT NO.        KIND   DATE       APPLICATION NO.   DATE 
PI   US 2003215807     A1     20031120   US 2002-143547    20020509 
PRAI US 2002-143547           20020509 

AB A method and system for normalizing two or more mol. array data sets. Input mol. 
array data sets are sep. globally normalized by, for example, dividing the feature-signal 
magnitudes of each data set by the geometric mean of the feature-signal magnitudes 
of the data set. The globally normalized feature signal magnitudes within each data set 
are ranked in ascending order. A numeric function is created that relates feature-signal 
magnitudes of the data sets. Only a subset of the features, obtained by selecting 
features that are similarly ranked in the sep. feature-signal-magnitude rankings for the 
data sets, is used to construct the numeric function. The numeric function is smoothed 
by one of many possible different smoothing procedures. The smoothed numeric 
function is used to rescale the feature-signal magnitude in one data set to the 
feature-signal magnitude of another data set, or to normalize the data sets to one another by 
distributing correction terms amongst the feature-signal magnitudes for a feature in 
each data set. 
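
The first steps described in the abstract above (global normalization by the geometric mean, ascending ranking, and selection of similarly ranked features) can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the function names, the max_rank_gap threshold, and the toy signal values are assumptions, and the later steps (building, smoothing, and applying the numeric function) are not shown.

```python
import math

def global_normalize(signals):
    """Divide each feature signal by the geometric mean of its data set."""
    log_mean = sum(math.log(s) for s in signals) / len(signals)
    geo_mean = math.exp(log_mean)
    return [s / geo_mean for s in signals]

def ascending_ranks(values):
    """Return the 0-based ascending rank of each value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for position, index in enumerate(order):
        rank[index] = position
    return rank

def similarly_ranked(set_a, set_b, max_rank_gap):
    """Indices of features whose ranks differ by at most max_rank_gap.

    max_rank_gap is a hypothetical parameter; the abstract does not say
    how "similarly ranked" is decided.
    """
    ranks_a, ranks_b = ascending_ranks(set_a), ascending_ranks(set_b)
    return [i for i in range(len(set_a))
            if abs(ranks_a[i] - ranks_b[i]) <= max_rank_gap]

# Toy feature-signal magnitudes for the same four features in two data sets.
set_a = global_normalize([100.0, 400.0, 1600.0, 200.0])
set_b = global_normalize([50.0, 210.0, 90.0, 800.0])
subset = similarly_ranked(set_a, set_b, max_rank_gap=1)
```

Only the features in subset would then be used to relate the two data sets, so features whose relative position changes strongly between the sets do not drive the normalization.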
AB A review. The authors' purpose is to highlight some of the past and potential 
future uses of microarray in nutrition research, while also commenting on some aspects 
of the design, conduct, and anal. of microarray data that will lead to improved data 
quality. In this review article the authors outline some of the aspects of microarray 
experimentation that must be considered before and during these expts. These topics 
include: identification of the expt.'s objective (hypothesis), the exptl. design, sample 
size, statistical anal., data verification, data handling, and exptl. interpretation. In 
order to illustrate the principles the authors outline in this article the authors use the 
methods to lay out the design of a microarray expt. to study one aspect of the 
observation that a diet high in soy is assocd. with lower rates of breast cancer. 
Microarrays are a very powerful tool for studying virtually every nutrition-related 
disease and trait and can provide valuable insights that are not obtainable with other 
techniques. However, unless nutrition researchers conduct their studies with scientific 
hard-mindedness, the studies will be of lower power at least if not completely 
misleading. 
AB The availability of high throughput methods for measurement of mRNA concns. 
makes the reliability of conclusions drawn from the data and global quality control of 
samples and hybridization important issues. We address these issues by an 
information theoretic approach, applied to discretized expression values in replicated 
gene expression data. The approach yields a quant. measure of two important 
parameter classes: First, the probability P(.sigma.|S) that a gene is in the biol. state 
.sigma. in a certain variety, given its obsd. expression S in the samples of that variety. 
Second, sample specific error probabilities which serve as consistency indicators of the 
measured samples of each variety. The method and its limitations are tested on gene 
expression data for developing murine B-cells and a t-test is used as ref. On a set of 
known genes it performs better than the t-test despite the crude discretization into only 
two expression levels. The consistency indicators, i.e. the error probabilities, correlate 
well with variations in the biol. material and thus prove efficient. The proposed method 
is effective in detg. differential gene expression and sample reliability in replicated 
microarray data. Already at two discrete expression levels in each sample, it gives a 
good explanation of the data and is comparable to std. techniques. 
AB Genomic studies of complex tissues pose unique anal. challenges for assessment 
of data quality, performance of statistical methods used for data extn., and detection of 
differentially expressed genes. Ideally, to assess the accuracy of gene expression anal. 
methods, one needs a set of genes which are known to be differentially expressed in 
the samples and which can be used as a "gold std.". The authors introduce the idea of 
using sex-chromosome genes as an alternative to spiked-in control genes or 
***simulations*** for assessment of ***microarray*** data and anal. methods. 
Expression of sex-chromosome genes were used as true internal biol. controls to 
compare alternate probe-level data extn. algorithms (Microarray Suite 5.0 [MAS5.0], 
Model Based Expression Index [MBEI] and Robust Multi-array Av. [RMA]), to assess 
microarray data quality and to establish some statistical guidelines for analyzing 
large-scale gene expression. These approaches were implemented on a large new dataset of 
human brain samples. RMA-generated gene expression values were markedly less 
variable and more reliable than MAS5.0 and MBEI-derived values. A statistical 
technique controlling the false discovery rate was applied to adjust for multiple testing, 
as an alternative to the Bonferroni method, and showed no evidence of false neg. 
results. Fourteen probe sets, representing nine Y- and two X-chromosome linked 
genes, displayed significant sex differences in brain prefrontal cortex gene expression. 
In this study, the authors have demonstrated the use of sex genes as true biol. internal 
controls for genomic anal. of complex tissues, and suggested anal. guidelines for 
testing alternate oligonucleotide microarray data extn. protocols and for adjusting 
multiple statistical anal. of differentially expressed genes. The results also provided 
evidence for sex differences in gene expression in the brain prefrontal cortex, 
supporting the notion of a putative direct role of sex-chromosome genes in 
differentiation and maintenance of sexual dimorphism of the central nervous system. 
Importantly, these anal. approaches are applicable to all microarray studies that include 
male and female human or animal subjects. 
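
The abstract above contrasts the Bonferroni method with a technique that controls the false discovery rate, without naming the procedure. The Python sketch below assumes the Benjamini-Hochberg step-up procedure as a stand-in; the function names and the toy p-values are hypothetical and are not data from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here, see lead-in).

    Finds the largest rank k with p_(k) <= (k / m) * fdr and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for k, index in enumerate(order, start=1):
        if p_values[index] <= k / m * fdr:
            cutoff = k
    rejected = [False] * m
    for index in order[:cutoff]:
        rejected[index] = True
    return rejected

# Toy p-values for five probe sets.
p_values = [0.001, 0.012, 0.02, 0.045, 0.30]
by_bonferroni = bonferroni(p_values)
by_fdr = benjamini_hochberg(p_values)
```

On these toy values the FDR procedure rejects three hypotheses and Bonferroni rejects one, which illustrates why an FDR-based adjustment is often preferred when many genes are tested at once.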
AB A potential benefit of profiling of tissue samples using microarrays is the 
generation of mol. fingerprints that will define subtypes of disease. Hierarchical 
clustering has been the primary anal. tool used to define disease subtypes from 
microarray expts. in cancer settings. Assessing cluster reliability poses a major 
complication in analyzing output from clustering procedures. While most work has 
focused on estg. the no. of clusters in a dataset, the question of stability of individual- 
level clusters has not been addressed. We address this problem by developing cluster 
stability scores using subsampling techniques. These scores exploit the redundancy in 
biol. discriminatory information on the chip. Our approach is generic and can be used 
with any clustering method. We propose procedures for calcg. cluster stability scores 
for situations involving both known and unknown nos. of clusters. We also develop 
cluster-size adjusted stability scores. The method is illustrated by application to data 
from three cancer studies: one involving childhood cancers, the second involving B-cell 
lymphoma, and the third from a malignant melanoma study. Code implementing the 
proposed analytic method can be obtained at the second author's website. 
AB Microarray technol. allows the monitoring of expression levels for thousands of 
genes simultaneously. This novel technique helps us to understand gene regulation as 
well as gene by gene interactions more systematically. In the microarray expt., 
however, many undesirable systematic variations are obsd. Even in replicated expt., 
some variations are commonly obsd. Normalization is the process of removing some 
sources of variation which affect the measured gene expression levels. Although a no. 
of normalization methods have been proposed, it has been difficult to decide which 
methods perform best. Normalization plays an important role in the earlier stage of 
microarray data anal. The subsequent anal. results are highly dependent on 
normalization. In this paper, we use the variability among the replicated slides to 
compare performance of normalization methods. We also compare normalization 
methods with regard to bias and mean square error using simulated data. Our results 
show that intensity-dependent normalization often performs better than global 
normalization methods, and that linear and nonlinear normalization methods perform 
similarly. These conclusions are based on anal. of 36 cDNA microarrays of 3,840 genes 
obtained in an expt. to search for changes in gene expression profiles during neuronal 
differentiation of cortical stem cells. Simulation studies confirm our findings. 
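
As a point of reference for the comparison described above, the Python sketch below shows one simple form of global normalization for two-color data: the median log ratio is subtracted from every spot. The abstract does not specify which global method was used, so the choice of the median, the function names, and the toy intensities are all assumptions.

```python
import math
from statistics import median

def log_ratios(red, green):
    """Per-spot log2 ratio M = log2(R / G) for a two-color array."""
    return [math.log2(r / g) for r, g in zip(red, green)]

def global_median_normalize(m_values):
    """Shift all log ratios so that their median is zero.

    One constant is applied to the whole array; an intensity-dependent
    method would instead let the correction vary with spot intensity.
    """
    center = median(m_values)
    return [m - center for m in m_values]

# Toy intensities: a uniform two-fold dye bias plus one truly changed spot.
red = [200.0, 400.0, 800.0, 1600.0, 100.0]
green = [100.0, 200.0, 400.0, 200.0, 50.0]
normalized = global_median_normalize(log_ratios(red, green))
```

Here the constant dye bias is removed and the fourth spot keeps a log ratio of 2. A bias that changes with intensity would not be removed by a single constant, which is the situation the intensity-dependent methods in the abstract are meant to handle.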
AB Invasive lobular and ductal breast tumors have distinct histologies and clin. 
presentation. Other than altered expression of E-cadherin, little is known about the 
underlying biol. that distinguishes the tumor subtypes. We used cDNA microarrays to 
identify genes differentially expressed between lobular and ductal tumors. 
Unsupervised clustering of tumors failed to distinguish between the two subtypes. 
Prediction anal. for microarrays (PAM) was able to predict tumor type with an accuracy 
of 93.7%. Genes that were significantly differentially expressed between the two 
groups were identified by MaxT permutation anal. using t tests (20 cDNA clones and 10 
unique genes), significance anal. for microarrays (33 cDNA clones and 15 genes, at an 
estd. false discovery rate of 2%), and PAM (31 cDNAs and 15 genes). There were 8 
genes identified by all three of these related methods (E-cadherin, survivin, cathepsin 
B, TPI1, SPRY1, SCYA14, TFAP2B, and thrombospondin 4), and an addnl. 3 that were 
identified by significance anal. for microarrays and PAM (osteopontin, HLA-G, and 
CHC1). To validate the differential expression of these genes, 7 of them were tested 
by real-time quant. PCR, which verified that they were differentially expressed in 
lobular vs. ductal tumors. In conclusion, specific changes in gene expression 
distinguish lobular from ductal breast carcinomas. These genes may be important in 
understanding the basis of phenotypic differences among breast cancers. 
AB Numerous DNA microarray hybridization expts. have been performed in yeast over 
the last years using either synthetic oligonucleotides or PCR-amplified coding 
sequences as probes. The design and quality of the microarray probes are of crit. 
importance for hybridization expts. as well as subsequent anal. of the data. We 
present here a novel design of Saccharomyces cerevisiae microarrays based on a 
refined annotation of the genome and with the aim of reducing cross-hybridization 
between related sequences. An effort was made to design probes of similar lengths, 
preferably located in the 3'-end of reading frames. The sequence of each gene was 
compared against the entire yeast genome and optimal sub-segments giving no 
predicted cross-hybridization were selected. A total of 5660 novel probes (more than 
97% of the yeast genes) were designed. For the remaining 143 genes, cross- 
hybridization was unavoidable. Using a set of 18 deletant strains, we have exptl. 
validated our cross-hybridization procedure. Sensitivity, reproducibility and dynamic 
range of these new microarrays have been measured. Based on this experience, we 
have written a novel program to design long oligonucleotides for microarray 
hybridizations of complete genome sequences. A validated procedure to predict cross- 
hybridization in microarray probe design was defined in this work. Subsequently, a 
novel Saccharomyces cerevisiae microarray (which minimizes cross-hybridization) was 
designed and constructed. Arrays are available at Eurogentec S. A. Finally, we 
propose a novel design program, OliD, which allows automatic oligonucleotide design 
for microarrays. The OliD program is available from authors. 
AB Unsupervised anal. of microarray gene expression data attempts to find biol. 
significant patterns within a given collection of expression measurements. For 
example, hierarchical clustering can be applied to expression profiles of genes across 
multiple expts., identifying groups of genes that share similar expression profiles. 
Previous work using the support vector machine supervised learning algorithm with 
microarray data suggests that higher-order features, such as pairwise and tertiary 
correlations across multiple expts., may provide significant benefit in learning to 
recognize classes of co-expressed genes. We describe a generalization of the 
hierarchical clustering algorithm that efficiently incorporates these higher-order 
features by using a kernel function to map the data into a high-dimensional feature 
space. We then evaluate the utility of the kernel hierarchical clustering algorithm using 
both internal and external validation. The expts. demonstrate that the kernel 
representation itself is insufficient to provide improved clustering performance. We 
conclude that mapping gene expression data into a high-dimensional feature space is 
only a good idea when combined with a learning algorithm, such as the support vector 
machine that does not suffer from the curse of dimensionality. 
AB To evaluate microarray data, clustering is widely used to group biol. samples or 
genes. However, problems arise when comparing heterologous databases. As the 
clustering algorithm searches for similarities between expts., it will most likely first sep. 
the data sets, masking relationships that exist between samples from different 
databases. We developed a program, Venn Mapper, to calc. the statistical significance 
of the no. of co-occurring differentially expressed genes in any of the two expts. For 
proof of principle, we analyzed a heterologous data set of 170 microarrays including 
breast and prostate cancer microarray analyses. Significant overlap was found in an 
unsupervised anal. between metastasized prostate cancer and metastasized breast 
cancer and BRCA mutated breast cancer. A comparison between single microarray 
data and the averaged breast and prostate data sets was also evaluated. This anal. 
suggests that genes expressed higher in stromal cells are also implicated in metastatic 
prostate cancer and BRCA mutated breast cancer. The Venn Mapper program identifies 
overlaps between samples from heterologous data sets and directly exts. the genes 
responsible for the overlap. 
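
The abstract above does not state how the significance of the number of co-occurring genes is calculated. One common choice, assumed here purely for illustration, is the upper tail of the hypergeometric distribution; the function name overlap_p_value and the toy counts in the usage note are hypothetical and do not describe the Venn Mapper program itself.

```python
from math import comb

def overlap_p_value(total_genes, hits_a, hits_b, shared):
    """Probability of seeing at least `shared` common genes by chance.

    Assumes expt. A flags hits_a genes and expt. B flags hits_b genes
    out of total_genes, and that B's genes are drawn at random
    (hypergeometric null model).
    """
    upper = min(hits_a, hits_b)
    favorable = sum(
        comb(hits_a, k) * comb(total_genes - hits_a, hits_b - k)
        for k in range(shared, upper + 1)
    )
    return favorable / comb(total_genes, hits_b)
```

For example, overlap_p_value(10, 3, 3, 3) evaluates to 1/120: if both expts. flag 3 of 10 genes, a complete overlap arises by chance in only one of the 120 possible selections.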
AB Signal data from DNA-microarray ("chip") technol. can be noisy; i.e., the signal 
variation of one gene on a series of repetitive chips can be substantial. It is becoming 
more and more recognized that a sufficient no. of chip replicates has to be made in 
order to sep. correct from incorrect signals. To reduce the systematic fraction of the 
noise deriving from pipetting errors, from different treatment of chips during 
hybridization, and from chip-to-chip manufg. variability, normalization schemes are 
employed. We present here an iterative nonparametric nonlinear normalization scheme 
called simultaneous alternating conditional expectation (sACE), which is designed to 
maximize correlation between chip repeats in all-chip-against-all space. We tested 
sACE on 28 expts. with 158 Affymetrix one-color chips. The procedure should be 
equally applicable to other DNA-microarray technologies, e.g., two-color chips. We 
show that the redn. of noise compared to a simple normalization scheme like the 
widely used linear global normalization leads to fewer false-pos. calls, i.e., to fewer 
genes which have to be laboriously confirmed by independent methods such as 
TaqMan or quant. PCR. 
AB DNA array technol. now allows an enormous amt. of expression data to be 
obtained. For large-scale gene profiling enterprises, this is of course welcome. 
However, the scientist interested in follow-up studies of a handful of differentially 
expressed genes may find it hard to sift through the vast datasets to pinpoint genes 
with the most desirable and reliable behaviors. Here, we present the methodol. we 
have employed to discover genes differentially expressed in the adult mouse brain. We 
first used Affymetrix microarrays to compare gene expression from five different brain 
regions: the amygdala, cerebellum, hippocampus, olfactory bulb, and periaqueductal 
gray. Second, we identified genes differentially expressed within three distinct 
amygdala subnuclei. In this case, the tissue was microdissected by laser-capture to 
minimize contamination from adjacent subnuclei, and extd. RNA was subjected to three 
rounds of linear amplification prior to hybridization to the microarrays. To select 
candidate genes, we developed a custom algorithm to identify those genes with the 
most robust changes in expression across different replicate samples. Confirmation of 
expression patterns with in situ hybridization uncovered further criteria to consider in 
the selection process. 
AB Methods are presented for detecting differential expression using statistical 
hypothesis testing methods including anal. of variance (ANOVA). Practicalities of exptl. 
design, power, and sample size are discussed. Methods for multiple testing correction 
and their application are described. Instructions for running typical analyses are given 
in the R programming environment. R code and the sample data set used to generate 
the examples are available at 
http://microarray.cpmc.columbia.edu/pavlidis/pub/aovmethods/. 
AB Normalization means to adjust microarray data for effects which arise from 
variation in the technol. rather than from biol. differences between the RNA samples or 
between the printed probes. This paper describes normalization methods based on the 
fact that dye balance typically varies with spot intensity and with spatial position on the 
array. Print-tip loess normalization provides a well-tested general purpose 
normalization method which has given good results on a wide range of arrays. The 
method may be refined by using quality wts. for individual spots. The method is best 
combined with diagnostic plots of the data which display the spatial and intensity 
trends. When diagnostic plots show that biases still remain in the data after 
normalization, further normalization steps such as plate-order normalization or 
scale-normalization between the arrays may be undertaken. Composite normalization may 
be used when control spots are available which are known to be not differentially 
expressed. Variations on loess normalization include global loess normalization and 
two-dimensional normalization. Detailed commands are given to implement the 
normalization techniques using freely available software. 
AB The authors developed an online microfluidic sensing device with an interdigitated 
array (IDA) electrode and a micro pre-reactor for the real-time monitoring of blood 
catecholamine (CA) and succeeded in the highly sensitive detection of dopamine (DA) 
in the presence of L-ascorbic acid (AA). The authors' device exhibits the lowest 
detection limit (110 .+-. 10 pM (S/N = 3)), of reported catecholamine sensors. The 
improvement in sensitivity results from the high redox cycling of DA and the increase in 
the mass transfer rate per unit time onto the IDA electrode achieved by the flow 
measurement. The pre-reactor was integrated upstream in the micro flow channel to 
eliminate AA. A large no. of rectangular shaped micropillars, which were modified with 
ascorbate oxidase, were formed in the pre-reactor to increase the surface area. The 
flow was disturbed by the two dimensional micropillar arrangement. This structure 
enables us to increase the elimination efficiency for AA. As a result, we achieved both 
the continuous and highly selective detection of 1 nM DA with complete elimination of 
10 .mu.M AA in the sample soln. without employing any selective membrane such as 
Nafion, whose use reduces sensitivity due to the low diffusion coeff. of DA inside the 
membrane. 
AB Gene microarrays are becoming a key tool for the anal. of changes in gene 
expression in a variety of conditions. Use of microarrays to analyze drug responses has 
mainly been restricted to comparing treated vs. untreated samples at a few time 
points. Such data do not permit the use of another important tool, 
pharmacokinetic/pharmacodynamic (PK/PD) modeling. Such modeling requires the 
simultaneous anal. of pharmacokinetic data along with time series data on dynamic 
responses. This report describes data obtained from two extended microarray time 
series (rat liver and skeletal muscle) for the in vivo responses to a single bolus dose of 
methylprednisolone that are uniquely available online in a single gene query format. 
Use of these data does not require any a priori knowledge or software normally 
necessary for the anal. of microarray data. Since the pharmacokinetic data and 
receptor model have been published, the results are amenable to PK/PD and 
pharmacogenomic evaluation. 
AB As structural and functional genomics efforts provide the biol. community with 
ever-broadening sets of interrelated data, the need to explore such complex 
information for subtle relationships expands. We present wCLUTO, a Web-enabled 
version of the stand-alone application CLUTO, designed to apply clustering methods to 
genomic information. Its first application is focused on clustering transcriptome
data from microarrays. Data can be uploaded by the user into the clustering tool, a 
choice of several clustering methods can be made and configured, and data are 
presented to the user in a variety of visual formats, including a three-dimensional 
"mountain" view of the clusters. Parameters can be explored to rapidly examine a 
variety of clustering results, and the resulting clusters can be downloaded either for 
manipulation by other programs or to be saved in a format for publication. 
AB In the first part of this paper the author presented an efficient, robust and
completely automated algorithm for spot and block indexing in microarray images with 
rectangular grids. Although the rectangular grid is currently the most common type of 
grouping the probes on microarray slides, there is another microarray technol. based 
on bundles of optical fibers where the probes are packed in hexagonal grids. The 
hexagonal grid provides both advantages and drawbacks over the std. rectangular 
packing and of course requires adaptation and/or modification of the algorithm of spot 
indexing presented in the first part of the paper. In the second part of the paper the 
author presents a version of the spot indexing algorithm adapted for microarray images
with spots packed in hexagonal structures. The algorithm is completely automated, 
works with hexagonal grids of different types and with different parameters of grid 
spacing and rotation as well as spot sizes. It can successfully trace the local and global 
distortions of the grid, including non-orthogonal transformations. Similar to the 
algorithm from part I, it scales linearly with the grid size: the time complexity is O(M),
where M is the total no. of grid points in the hexagonal grid. The algorithm has been tested
both on CCD and scanned images with spot expression rates as low as 2%. The 
processing time of an image with about 50 000 hex grid points was less than a second. 
For images with high expression rates (.apprx.90%) the registration time is even 
smaller, around a quarter of a second. 
AB We describe two sets of programs for expediting routine tasks in anal. of cDNA
microarray data and promoter sequences. The first set permits bad data points to be 
flagged with respect to a no. of parameters and performs normalization in three 
different ways. It allows combining of result files into comprehensive data sets, 
evaluation of the quality of both tech. and biol. replicates and row and/or column 
standardization of data matrixes. The second set supports mapping ESTs in the
genome, identifying the corresponding genes and recovering their promoters, analyzing 
promoters for transcription factor binding sites, and visual representation of the results. 
The programs are designed primarily for Arabidopsis thaliana researchers, but can be 
adapted readily for other model systems. 
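
The row and/or column standardization mentioned in the abstract above can be sketched as follows. This is only an illustrative sketch, not the authors' program: the function names are hypothetical, and the choice of mean 0 and population std. deviation 1 as the target scale is an assumption.

```python
from statistics import mean, pstdev

def standardize_rows(matrix):
    """Return a copy of matrix with each row scaled to mean 0 and SD 1."""
    result = []
    for row in matrix:
        m, s = mean(row), pstdev(row)
        # A constant row has no spread; map it to zeros instead of dividing by 0.
        result.append([(x - m) / s if s > 0 else 0.0 for x in row])
    return result

def standardize_columns(matrix):
    """Standardize columns: transpose, standardize rows, transpose back."""
    transposed = [list(col) for col in zip(*matrix)]
    return [list(row) for row in zip(*standardize_rows(transposed))]
```

Calling standardize_rows on a genes-by-arrays matrix puts every gene on a common scale; standardize_columns does the same per array.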
AB Most methods of analyzing microarray data or doing power calcns. have an
underlying assumption of const. variance across all levels of gene expression. The
most common transformation, the logarithm, results in data that have const. variance
at high levels but not at low levels. Rocke and Durbin showed that data from spotted 
arrays fit a two-component model and Durbin, Hardin, Hawkins, and Rocke, Huber et 
al. and Munson provided a transformation that stabilizes the variance as well as 
symmetrizes and normalizes the error structure. We wish to evaluate the applicability 
of this transformation to the error structure of GeneChip microarrays. We demonstrate 
in an example study a simple way to use the two-component model of Rocke and 
Durbin and the data transformation of Durbin, Hardin, Hawkins and Rocke, Huber et al. 
and Munson on Affymetrix GeneChip data. In addn. we provide a method for 
normalization of Affymetrix GeneChips simultaneous with the detn. of the 
transformation, producing a data set without chip or slide effects but with const.
variance and with sym. errors. This transformation/normalization process can be
thought of as a machine calibration in that it requires a few biol. const. replicates of
one sample to det. the const. needed to specify the transformation and normalize. It is
hypothesized that this const. needs to be found only once for a given technol. in a lab,
perhaps with periodic updates. It does not require extensive replication in each study. 
Furthermore, the variance of the transformed pilot data can be used to do power 
calcns. using std. power anal. programs.
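
A minimal sketch of a variance-stabilizing transformation of the kind the abstract above refers to. The abstract does not give the formula; the generalized-logarithm form ln(z + sqrt(z^2 + c)) with z = y - alpha is assumed here, where alpha is a background offset and c is the const. that the calibration replicates would supply. The function name and the parameter values in the usage note are hypothetical.

```python
import math

def glog(y, alpha, c):
    """Generalized-log transform ln(z + sqrt(z^2 + c)) with z = y - alpha.

    Assumed form: behaves like ln(2z) at high intensity and is roughly
    linear near z = 0, so it stays defined for background-level or
    negative corrected intensities (requires c > 0).
    """
    z = y - alpha
    return math.log(z + math.sqrt(z * z + c))
```

For example, glog(50.0, 50.0, 400.0) equals ln(20), the value at zero corrected intensity, whereas a plain logarithm would be undefined there.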
AB Theor. considerations suggest that current microarray screening algorithms may
fail to detect many true differences in gene expression (Type II analytic errors). We 
assessed 'false neg.' error rates in differential expression analyses by conventional 
linear statistical models (e.g. t-test), microarray-adapted variants (e.g. SAM, Cyber-T), 
and a novel strategy based on hold-out cross-validation. The latter approach employs 
the machine-learning algorithm Patient Rule Induction Method (PRIM) to infer min. 
thresholds for reliable change in gene expression from Boolean conjunctions of
fold-induction and raw fluorescence measurements. Monte Carlo analyses based on four
empirical data sets show that conventional statistical models and their
microarray-adapted variants overlook more than 50% of genes showing significant up-regulation.
Conjoint PRIM prediction rules recover approx. twice as many differentially expressed 
transcripts while maintaining strong control over false-pos. (Type I) errors. As a result, 
exptl. replication rates increase and total analytic error rates decline. RT-PCR studies
confirm that gene inductions detected by PRIM but overlooked by other methods 
represent true changes in mRNA levels. PRIM-based conjoint inference rules thus 
represent an improved strategy for high-sensitivity screening of DNA microarrays. 
RE.CNT 27 THERE ARE 27 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 87 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:819231 CAPLUS 
DN 140:36589 

TI Representational oligonucleotide microarray analysis: A high-resolution method to
detect genome copy number variation 

AU Lucito, Robert; Healy, John; Alexander, Joan; Reiner, Andrew; Esposito, Diane;
Chi, Maoyen; Rodgers, Linda; Brady, Amy; Sebat, Jonathan; Troge, Jennifer; West,
Joseph A.; Rostan, Seth; Nguyen, Ken C. Q.; Powers, Scott; Ye, Kenneth Q.; Olshen,
Adam; Venkatraman, Ennapadam; Norton, Larry; Wigler, Michael

CS Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA 

SO Genome Research (2003), 13(10), 2291-2305 CODEN: GEREFS; ISSN: 1088-9051 

PB Cold Spring Harbor Laboratory Press 

DT Journal 

LA English 

AB We have developed a methodol. we call ROMA (representational oligonucleotide 
microarray anal.), for the detection of the genomic aberrations in cancer and normal 
humans. By arraying oligonucleotide probes designed from the human genome 
sequence, and hybridizing with "representations" from cancer and normal cells, we 
detect regions of the genome with altered "copy no.". We achieve an av. resoln. of 30
kb throughout the genome, and resolns. as high as a probe every 15 kb are practical. 
We illustrate the characteristics of probes on the array and accuracy of measurements 
obtained using ROMA. Using this methodol., we identify variation between cancer and 
normal genomes, as well as between normal human genomes. In cancer genomes, we 
readily detect amplifications and large and small homozygous and hemizygous 
deletions. Between normal human genomes, we frequently detect large (100 kb to 1 
Mb) deletions or duplications. Many of these changes encompass known genes. ROMA 
will assist in the discovery of genes and markers important in cancer, and the discovery 
of loci that may be important in inherited predispositions to disease. 
AB The present work describes a complete probe design software system for
oligonucleotide microarrays based on Kane's research on probe sensitivity and 
specificity (Kane's rule). Combining Kane's rule and traditional criteria for probe design 
we constructed MProbe, the software system for oligonucleotide microarrays using 
Java. The general criteria for probe design are: (1) probes may have different lengths 
that range from 20 to 100 bases; (2) they should have a similar melting temp. (Tm) or
GC content; (3) they should not contain stable secondary structures; and (4) they 
abide by Kane's rule. 
AB Tissue microarrays are increasingly important tools that bring high-throughput
technol. to traditional pathol. labs. In many cases, each spot on a tissue microarray is 
scored by a skilled pathologist and recorded manually. TAD consists of an Active 
Server Page web interface to a relational database that automates recording scores and 
linking them with clin. data for future interpretation. TAD is an open source application
that can be installed locally. 

RE.CNT 9 THERE ARE 9 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 90 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:804603 CAPLUS 
DN 140:36577 



TI A software package for cDNA microarray data normalization and assessing 
confidence intervals 

AU Hyduke, Daniel R.; Rohlin, Lars; Kao, Katy C.; Liao, James C.

CS Department of Chemical Engineering, University of California at Los Angeles, CA, USA

SO OMICS (2003), 7(3), 227-234 CODEN: OMICAE; ISSN: 1536-2310
PB Mary Ann Liebert, Inc.
DT Journal 
LA English 

AB DNA microarray data are affected by variations from a no. of sources. Before these 
data can be used to infer biol. information, the extent of these variations must be 
assessed. Here we describe an open source software package, IcDNA, that provides 
tools for filtering, normalizing, and assessing the statistical significance of cDNA 
microarray data. The program employs a hierarchical Bayesian model and Markov 
Chain Monte Carlo simulation to est. gene-specific confidence intervals for each gene in 
a cDNA microarray data set. This program is designed to perform these primary anal.
operations on data from two-channel spotted, or in situ synthesized, DNA microarrays. 
AB A method for mapping complex trait genes using cDNA ***microarray*** and
mol. marker data jointly is presented and illustrated via ***simulation***. We
introduce a novel approach for ***simulating*** phenotypes and genotypes
conditionally on real, publicly available, ***microarray*** data. The model assumes
an underlying continuous latent variable (liability) related to some measured cDNA 
expression levels. Partial least-squares logistic regression is used to est. the liability 
under several scenarios where the level of gene interaction, the gene effect, and the 
no. of cDNA levels affecting liability are varied. The results suggest that: (1) the 
usefulness of microarray data for gene mapping increases when both the no. of cDNA 
levels in the underlying liability and the QTL effect decrease and when genes are 
coexpressed; (2) the correlation between estd. and true liability is large, at least under 
our simulation settings; (3) it is unlikely that cDNA clones identified as significant with 
partial least squares (or with some other technique) are the true responsible cDNAs, 
esp. as the no. of clones in the liability increases; (4) the no. of putatively significant 
cDNA levels increases critically if cDNAs are coexpressed in a cluster (however, the 
proportion of true causal cDNAs within the significant ones is similar to that in a
no-coexpression scenario); and (5) data redn. is needed to smooth out the variability
encountered in expression levels when these are analyzed individually. 
AB The kinetics of hybridization on the oligonucleotide microchip with gel pads is
studied both theor. and exptl. The monitoring of kinetics was performed with the 
measurements of fluorescence intensity produced by the labeled target 
oligonucleotides. As is shown, the hybridization time depends on the stability of the 
formed duplexes, the concns. of target and probe oligonucleotides, and the diffusion of 
target oligonucleotides in soln. and gel pad. The initial stage of hybridization is detd. 
by the flow of target oligonucleotides from soln., then, followed by the diffusive 
propagation with approx. const. concn. of oligonucleotides at the boundary of gel pad
and, finally, by the exponential satn. The theor. predictions of hybridization kinetics 
reveal a good correspondence with the exptl. results and may be used for the choice of 
the optimal hybridization conditions. The possible applications of kinetic hybridization 
curves to the discrimination problems and assessment of diffusion coeffs. in gel pads 
are briefly discussed. Finally, we discuss the relationships between the binding kinetics 
and the general functioning of biomol. microchips. 
AB A review. The upcoming availability of public microarray repositories and of large
compendia of gene expression information opens up a new realm of possibilities for 
microarray data anal. An essential challenge is the efficient integration of microarray 
data generated by different research groups on different array platforms. This review 
focuses on the problems assocd. with this integration, which are: (1) the efficient 
access to and exchange of microarray data; (2) the validation and comparison of data 
from different platforms (cDNA and short and long oligonucleotides); and (3) the 
integrated statistical anal. of multiple data sets.
AB A review. In a recent issue of PNAS, Wright et al. proposed a statistical model
that can be used to translate exptl. results across microarray platforms. The model is 
based on a linear predictor score (LPS) applied to hierarchical clustering results. The 
model was used to reanalyze oligonucleotide microarray data from a previous study of 
diffuse large B cell lymphoma tumors, and sep. the tumor samples into three groups 
corresponding to distinct clin. outcomes. Cross-validation of gene expression results 
among data sets generated in particular types of cancer by using methods such as 
those described by Wright et al. should help to define the genes most relevant for 
disease classification, prognostics, and therapeutic targeting. 
AB Presented here is the program ChipCheck that allows the computation of total
hybridization equil. for hybridization expts. involving small oligonucleotide arrays. The 
calcn. requires the free energies of binding for all pairs of probes and targets as well as 
total strand concns. and probe mol. nos. ChipCheck has been tested computationally 
on microarrays with up to 100 spots and 42 target strands (4200 binding equil.). It 
arrives at solns. through iterations employing the multidimensional Newton method. 
While currently running in simulation mode only, an extension of the approach to the 
exhaustive anal. of chip results is being outlined and may be implemented in the
future. The output displays the extent of correct and cross hybridization both
graphically and numerically. In principle, calcg. total hybridization equil. allows for
eliminating noise from DNA chip results and thus an improvement in sensitivity and 
accuracy. 
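
The calcn. described in the abstract above (coupled binding equil. solved by the multidimensional Newton method) can be illustrated with a small sketch. It is not ChipCheck's code. It assumes each probe-target pair forms a simple 1:1 duplex with assocn. const. K[i][j] (derivable from the free energy of binding as exp(-dG/RT)), and it applies Newton's method to the logarithms of the free concns., with a capped step as an added safeguard. All names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def hybridization_equilibrium(K, probe_total, target_total, tol=1e-12, max_iter=200):
    """Return (free probe concns., free target concns.) at equilibrium."""
    n_p, n_t = len(probe_total), len(target_total)
    # Unknowns are log free concentrations; start with nothing bound.
    z = [math.log(c) for c in list(probe_total) + list(target_total)]
    for _ in range(max_iter):
        x = [math.exp(v) for v in z[:n_p]]
        y = [math.exp(v) for v in z[n_p:]]
        duplex = [[K[i][j] * x[i] * y[j] for j in range(n_t)] for i in range(n_p)]
        # Mass-balance residuals: free + bound - total, for every strand.
        res = [x[i] + sum(duplex[i]) - probe_total[i] for i in range(n_p)]
        res += [y[j] + sum(duplex[i][j] for i in range(n_p)) - target_total[j]
                for j in range(n_t)]
        if max(abs(r) for r in res) < tol:
            break
        # Jacobian of the residuals with respect to the log concentrations.
        jac = [[0.0] * (n_p + n_t) for _ in range(n_p + n_t)]
        for i in range(n_p):
            jac[i][i] = x[i] + sum(duplex[i])
            for j in range(n_t):
                jac[i][n_p + j] = duplex[i][j]
                jac[n_p + j][i] = duplex[i][j]
        for j in range(n_t):
            jac[n_p + j][n_p + j] = y[j] + sum(duplex[i][j] for i in range(n_p))
        step = solve_linear(jac, [-r for r in res])
        # Safeguard (an addition of this sketch): cap each log-step at +/-1.
        z = [v + max(-1.0, min(1.0, s)) for v, s in zip(z, step)]
    return [math.exp(v) for v in z[:n_p]], [math.exp(v) for v in z[n_p:]]
```

With one probe and one target at unit total concn. and K = 1, the mass balance x + x*x = 1 gives a free concn. of (sqrt(5) - 1)/2, which the solver reproduces.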
AB A review. DNA microarray technol. is a high-throughput method for gaining
information on gene function. Microarray technol. is based on deposition/synthesis, in 
an ordered manner, on a solid surface, of thousands of EST 
sequences/genes/oligonucleotides. Due to the high no. of generated datapoints, 
computational tools are essential in microarray data anal. and mining to grasp
knowledge from exptl. results. In this review, we will focus on some of the 
methodologies actually available to define gene expression intensity measures, 
microarray data normalization, and statistical validation of differential expression. 
AB The retrieval of useful data from spotted microarray slides requires keeping track
of which microplate well and DNA sample correspond to each spot on each array
slide. Existing approaches are closely coupled with the type of arrayer in use and are 
computer operating-system-specific. To support the ***microarray*** researcher 
community at large who use different arrayers and ***computer*** platforms, 
increased flexibility, generality, and portability of these approaches are required. In this 
paper, we describe a general algorithm that correlates the well positions of DNA 
samples in each microplate to the positions of the spots on each array slide. Based on
this algorithm, we have implemented a flexible and platform-independent program 
named MicroArray Convolutor (MAC). MAC provides a Web soln. allowing the user to 
import a text file that identifies the DNA samples and their well locations and to select a 
transformation method that converts data in 96-well plate format into 384-well plate
format. It also specifies the output format of the array lists dependent on the
configuration of the array platform as well as the downstream anal. software chosen
for the array. MAC and its source code can be accessed via the following Web address: 
http://ymd.med.yale.edu/kei- cgi/kc_mac dev8.pl. 
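
The 96-well to 384-well conversion mentioned in the abstract above can be sketched as follows. The abstract does not say which transformation methods MAC offers; the interleaved quadrant layout used here (four 96-well plates filling alternating rows and columns of one 384-well plate) is one common convention and is an assumption, as is the function name.

```python
import string

def well_96_to_384(plate_index, well):
    """Map a 96-well position such as 'A1' on source plate 0..3 to a 384-well position.

    Assumed convention: source plates interleave, so plate 0 lands on
    A1, A3, ..., plate 1 on A2, A4, ..., plate 2 on B1, B3, ..., and
    plate 3 on B2, B4, ...
    """
    row = string.ascii_uppercase.index(well[0].upper())  # 0..7 for A..H
    col = int(well[1:]) - 1                               # 0..11
    if not (0 <= plate_index < 4 and 0 <= row < 8 and 0 <= col < 12):
        raise ValueError("not a valid 96-well position")
    row_384 = 2 * row + plate_index // 2                  # 0..15 for A..P
    col_384 = 2 * col + plate_index % 2                   # 0..23
    return f"{string.ascii_uppercase[row_384]}{col_384 + 1}"
```

Under this convention each of the 4 x 96 source wells maps to a distinct 384-well position, so sample identities can be tracked through the replating step.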
AB The present invention relates to the use of DNA microarrays for gene expression
profiling or detection of single nucleotide polymorphisms assocd. with disease and 
methods for diagnosis. Gene expression, gene representation, or the SNPs of genes, 
from samples of blood, bodily fluids, or affected parts collected from patients are 
analyzed and stored in a database. With such a system configuration as described 
above, it is possible to perform the preprocessing of the aforementioned samples and 
gene detection within the same container and the anal. of the resulting data is
performed without interruption and completed within an hour. 
AB A Tcl/Tk-based application called GenoMap, a viewer for genome-wide map of
microarray expression data within a circular bacterial genome, is described. An 
interactive interface facilitates easy identification of the expressed region. This 
software is also used for drawing genome-wide quant. data.
AB A review. Several "high throughput methods" have been introduced into research
and routine labs. during the past decade. Providing a new approach to the anal. of
genomic alterations and RNA or protein expression patterns, these new techniques
generate a plethora of new data in a relatively short time, and promise to deliver clues
to the diagnosis and treatment of human cancer. Along with these revolutionary 
developments, new tools for the interpretation of these large sets of data became 
necessary and are now widely available. Tissue microarray (TMA) technol. is one of 
these new tools. It is based on the idea of applying miniaturization and a high 
throughput approach to the anal. of intact tissues. The potential and the scientific
value of TMAs in modern research have been demonstrated in a logarithmically 
increasing no. of studies. The spectrum for addnl. applications is widening rapidly, and 
comprises quality control in histotechnol., longterm tissue banking, and the continuing 
education of pathologists. This review covers the basic tech. aspects of TMA prodn. 
and discusses the current and potential future applications of TMA technol. 
AB Summary: Microarray technol. is now routinely used to monitor genome-wide
expression profiles. However, current microarray imaging and anal. packages typically
require manual intervention and assumptions on alignments. Unfortunately, limitations 
and assumptions are typically undisclosed and methods are not published. To facilitate 
exploration of image data, we developed SignalViewer. This paper presents a 
description of the application. Availability: SignalViewer is available at Supplementary 
information: Screenshots are available at the above location, along with downloads for 
Windows 2000 and Linux (Redhat 7.2). 
AB Cells grow in dynamically evolving populations, yet this aspect of expts. often goes
unmeasured. A method is proposed for measuring the population dynamics of cells on 
the basis of their mRNA expression patterns. The population's expression pattern is 
modeled as the linear combination of mRNA expression from pure samples of cells, 
allowing reconstruction of the relative proportions of pure cell types in the population. 
Application of the method, termed expression deconvolution, to yeast grown under 
varying conditions reveals the population dynamics of the cells during the cell cycle, 
during the arrest of cells induced by DNA damage and the release of arrest in a cell
cycle checkpoint mutant, during sporulation, and following environmental stress. Using 
expression deconvolution, cell cycle defects are detected and temporally ordered in 146 
yeast deletion mutants; six of these defects are independently exptl. validated. 
Expression deconvolution allows a reinterpretation of the cell cycle dynamics underlying 
all previous microarray expts. and can be more generally applied to study most forms 
of cell population dynamics. 
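
The linear-combination model in the abstract above can be illustrated for the simplest case of two pure cell types. This is a hedged sketch, not the published method: it fits the two proportions by ordinary least squares through the 2x2 normal equations, and the clipping of negative values and the rescaling to a sum of 1 are assumptions added here. The function name is hypothetical.

```python
def deconvolve_two_types(mixture, pure_a, pure_b):
    """Estimate the fractions of two pure cell types in a mixed sample.

    Minimizes ||f_a*pure_a + f_b*pure_b - mixture||^2 over (f_a, f_b),
    then clips negatives to 0 and rescales so the fractions sum to 1.
    """
    aa = sum(a * a for a in pure_a)
    bb = sum(b * b for b in pure_b)
    ab = sum(a * b for a, b in zip(pure_a, pure_b))
    am = sum(a * m for a, m in zip(pure_a, mixture))
    bm = sum(b * m for b, m in zip(pure_b, mixture))
    det = aa * bb - ab * ab
    if det == 0:
        raise ValueError("pure profiles are collinear; fractions not identifiable")
    f_a = max(0.0, (am * bb - bm * ab) / det)
    f_b = max(0.0, (bm * aa - am * ab) / det)
    total = f_a + f_b
    if total == 0:
        raise ValueError("mixture is not a non-negative combination of the profiles")
    return f_a / total, f_b / total
```

With more than two pure cell types the same idea applies, but a general constrained least-squares solver would be needed in place of the closed-form 2x2 solution.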
AB Two channel microarray data often contain systematic variations that can be
minimized by data transformation prior to further anal. The most commonly obsd. 
effects are revealed by viewing scatter plots of the logarithm of the ratio by the av. 
logarithmic intensity of the two color channels (RI plots). In this paper we present a 
general model for signal intensity data with multiple error sources. We demonstrate 
how these sources of error influence the shape of an RI plot. We then compare some 
currently available transformation strategies in terms of their mechanism and 
performance on both ***simulated*** and real ***microarray*** data. A linlog
transformation is proposed to stabilize the variance of the log ratios. We also propose 
a regional smoothing method to remove variation in log ratios due to spatial 
heterogeneity on the microarray surface. The discussed transformations represent an
important initial step in microarray data anal. for both ratio-based and ANOVA
methods. 
AB The current std. correlation coeff. used in the anal. of microarray data was
introduced by M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein [(1998) Proc. 
Natl. Acad. Sci. USA 95, 14863-14868]. Its formulation is rather arbitrary. We give a 
math. rigorous correlation coeff. of two data vectors based on James-Stein shrinkage
estimators. We use the assumptions described by Eisen et al., also using the fact that 
the data can be treated as transformed into normal distributions. While Eisen et al. use 
zero as an estimator for the expression vector mean .mu., we start with the 
assumption that for each gene, .mu. is itself a zero-mean normal random variable [with 
a priori distribution N(0, .tau.2)], and use Bayesian anal. to obtain a posteriori
distribution of .mu. in terms of the data. The shrunk estimator for .mu. differs from 
the mean of the data vectors and ultimately leads to a statistically robust estimator for 
correlation coeffs. To evaluate the effectiveness of shrinkage, we conducted in silico 
expts. and also compared similarity metrics on a biol. example by using the data set 
from Eisen et al. For the latter, we classified genes involved in the regulation of yeast 
cell-cycle functions by computing clusters based on various definitions of correlation 
coeffs. and contrasting them against clusters based on the activators known in the
literature. The estd. false positives and false negatives from this study indicate that 
using the shrinkage metric improves the accuracy of the anal. 
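
The family of correlation coeffs. contrasted in the abstract above can be sketched with a single shrinkage factor, here called gamma (a hypothetical parameter name). The abstract does not state how the shrinkage factor is estd. from the data, so gamma is left as an input; only the two end points are fixed by the text: a zero mean estimator (gamma = 0) and the sample mean (gamma = 1, the ordinary Pearson coeff.).

```python
import math

def offset_correlation(x, y, gamma):
    """Correlation of x and y after subtracting gamma * (sample mean) from each.

    gamma = 0 assumes the true mean is zero (uncentered correlation);
    gamma = 1 subtracts the full sample mean (Pearson correlation);
    values in between shrink the mean estimate toward zero.
    """
    x_off = gamma * sum(x) / len(x)
    y_off = gamma * sum(y) / len(y)
    dx = [v - x_off for v in x]
    dy = [v - y_off for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den
```

For x = [1, 2, 3] and y = [3, 2, 1] the two end points disagree strongly: gamma = 1 gives -1, while gamma = 0 gives 10/14, which shows why the choice of mean estimator matters for clustering.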
AB An app., method, computer program, and recording media for optimization of
gene expression profile anal. data, are presented. Gene expression level expressed in
fluorometric data, measured exptl. with DNA microarrays or DNA chips, of a
comparative group and a control group are cor. based on a novel mathematic model. 
Based on the scatter plots thus cor., a novel X-Y axis system having an x axis 
proportional to the fluorescence intensity of genes is constructed. Next, windows each 
having a definite no. of genes are made along the X-axis and the reliability limit of the 
arbitrary risk is detd. in each window in accordance with Student's t-distribution. Then 
windows are shifted by a definite no. of genes in the X-axis direction and each 
reliability limit is detd. The plural reliability limits thus detd. are complemented by 
smoothening (spline curve) to give a reliability curve of expression variation. 
Subsequently, genes located outside the reliability curve of expression variation thus 
obtained are extd. as variation genes. 
AB At present, limiting factors in the use of tissue microarrays (TMAs) for
high-throughput anal. relate to the visual evaluation of the staining patterns of each of the
individual cores in the array and to the subsequent input of the results into a database. 
Such a database is essential to correlate the data with tumor type and outcome, and to 
evaluate the performance against other markers achieved in sep. expts. So far, these 
steps are mostly performed by hand, and consequently are time-consuming and 
potentially prone to bias and errors, resp. This paper describes the use of a
high-resoln. flat-bed scanner for digitization of TMAs with a resoln. of about 5 .times. 5
.mu.m2. The arrays are acquired, the positions of the tissue cores are automatically 
detd., and measurement data including the images of the individual cores are archived.
The program provides digital zooming of arrays for interactive verification of the results 
and rapid linkage of individual core images to data sets of other markers derived from 
the same array. Performance of the system was compared to manual classification for 
a representative set of arrays contg. colorectal tumors stained with different markers. 
AB The possibility of constructing high-d. parallel computing architectures using mol.
electronics technol. is explored. By employing mol. computing devices, new
circuit/system integration could be realized. To clarify the proposed concept, an exptl. 
model of a redox microarray is presented. A first exptl. system for a redox microarray 
consists of a two-dimensional array of platinum microelectrodes to catalyze reversible 
reactions of redox-active mols. Exptl. results of active wave propagation in the redox 
microarray are presented to demonstrate the potential of mol. computing devices for 
creating artificially programmable reaction-diffusion dynamics for specific target 
applications. 
AB We est. the no. of microarrays that is required in order to gain reliable results from
a common type of study: the pairwise comparison of different classes of samples. We 
show that current knowledge allows for the construction of models that look realistic 
with respect to searches for individual differentially expressed genes and derive 
prototypical parameters from real data sets. Such models allow investigation of the 
dependence of the required no. of samples on the relevant parameters: the biol. 
variability of the samples within each class, the fold changes in expression that are 
desired to be detected, the detection sensitivity of the microarrays, and the acceptable 
error rates of the results. 
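
How the required no. of samples depends on the parameters listed in the abstract above can be illustrated with the textbook two-sample normal approximation. This is not the authors' model: the formula n = 2 * ((z_alpha + z_beta) * sigma / delta)^2 per class, the use of log2 expression values, and the default error rates are all assumptions of this sketch.

```python
import math
from statistics import NormalDist

def arrays_per_class(sigma, fold_change, alpha=0.001, power=0.95):
    """Approximate no. of samples per class for a two-sample z-test on log2 expression.

    sigma: std. deviation of log2 expression within a class
    fold_change: smallest fold change to be detected (e.g. 2.0)
    alpha: two-sided false-positive rate per gene
    power: 1 - false-negative rate
    """
    delta = math.log2(fold_change)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)
```

The result grows with the square of the within-class variability and falls with the square of the log fold change, so halving sigma cuts the required no. of arrays roughly fourfold.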
AB Common heritable diseases ("complex traits") are assumed to be due to multiple
underlying susceptibility genes. While genetic mapping methods for Mendelian 
disorders have been very successful, the search for genes underlying complex traits 
has been difficult and often disappointing. One of the reasons may be that most 
current gene-mapping approaches are still based on conventional methodol. of testing 
one or a few SNPs at a time. Here, we demonstrate a simple strategy that allows for 
the joint anal. of multiple disease-assocd. SNPs in different genomic regions. Our
set-assocn. method combines information over SNPs by forming sums of relevant
single-marker statistics. As previously hypothesized, we show here that this approach
successfully addresses the "curse of dimensionality" problem: too many variables
should be estd. with a comparatively small no. of observations. We also report results
of simulation studies showing that our method furnishes unbiased and accurate
significance levels. Power calcns. demonstrate good power even in the presence of 
large nos. of nondisease assocd. SNPs. We extended our method to microarray 
expression data, where expression levels for large nos. of genes should be compared 
between two tissue types. In applications to such data, our approach turned out to be 
highly efficient. 
AB Although gathered as continuous data, expression measurements from gene
microarrays may be quantized before downstream anal. and modeling. This is esp.
true for modeling gene prediction and genetic regulatory networks. Coarse quantization 
results in lower computational requirements, lower data requirements for model 
inference, and easier conceptualization. This paper proposes a mixt. model for 
binarization. For each gene, the model, composed of a sum of two distributions, is fit 
to expression data for that gene, and data points are binarized according to the model. 
The mixt. model is based on the assumption of multiplicative up-regulation. The 
proposed method is compared with mean and median binarization by comparing 
classification performance based on the binary data from the different methods. 
Classification is performed for ***simulated*** data generated from a 
***microarray*** model studied previously and for cancer data arising from two 
studies involving hereditary breast cancer and small, round blue-cell tumors of 
childhood. 
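
The mean and median binarization that the abstract above uses as baselines can be sketched in a few lines. The proposed mixt. model itself is not specified in enough detail in the abstract to reproduce, so only the two baselines are shown; the function name and the strict greater-than rule at the threshold are assumptions.

```python
from statistics import mean, median

def binarize(values, threshold="median"):
    """Binarize one gene's expression values: 1 if above the threshold, else 0."""
    cut = median(values) if threshold == "median" else mean(values)
    return [1 if v > cut else 0 for v in values]
```

For the values [1.0, 1.2, 0.9, 1.1, 9.0], median binarization marks two samples as "on" while mean binarization marks only the outlier, which illustrates how sensitive the coarse quantization is to the choice of threshold.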
AB We report two cDNA microarray-based applications of DNA-nanocrystal
conjugates, single-nucleotide polymorphism (SNP) and multiallele detections, using a 
com. scanner and two sets of nanocrystals with orthogonal emissions. We focus on 
SNP mutation detection in the human p53 tumor suppressor gene, which has been 
found to be mutated in more than 50% of the known human cancers.
DNA-nanocrystal conjugates are able to detect both SNP and single-base deletion at room
temp. within minutes, with true-to-false signal ratios above 10. We also demonstrate
microarray-based multiallele detection, using hybridization of multicolor nanocrystals 
conjugated to two sequences specific for the hepatitis B and hepatitis C virus, two 
common viral pathogens that inflict more than 10% of the population in the developing 
countries worldwide. The simultaneous detection of multiple genetic markers with 
microarrays and DNA-nanocrystal conjugates has no precedent and suggests the 
possibility of detecting an even greater no. of bacterial or viral pathogens 
simultaneously. 
AB Confocal laser scanning microscopy was employed for the detn. of binding consts. 
of receptor-ligand interactions in a microarray format. Protocols for a localized 
immobilization of amine contg. substances on glass via GOPTS (3- 
glycidyloxypropyl)trimethoxysilane were optimized with respect to the detection of 
ligand binding by fluorescence. Compatibility with miniaturization by nanopipetting 
devices was ensured during all steps. The interaction of the tripeptide L-Lys-D-Ala-D- 
Ala with vancomycin immobilized on glass served as a model. To minimize 
consumption of ligand, binding consts. were detd. by stepwise titrn. of binding sites. 
The binding const. of the unlabeled ligand was detd. by competitive titrn. with a 
fluorescently labeled analog. The detd. binding consts. agreed well with those detd. by 
other techniques, previously. Labeled ligand bound stronger than the unlabeled one. 
This difference was dye-dependent. Still, binding was specific for the tripeptide moiety 
confirming that ligand and fluorescent analog competed for the same binding sites. 
These results validate the detn. of binding consts. by competitive titrn. The protocols 
established for confocal fluorescence detection are applicable to axially resolved 
detection modalities and screening for unlabeled ligands by competitive titrn. in 
general. 
AB Microarray technol. has become a very important tool for studying gene 
expression profiles under various conditions. Biologists often pool RNA samples extd. 
from different subjects onto a single microarray chip to help defray the cost of 
microarray expts. as well as to correct for the tech. difficulty in getting sufficient RNA 
from a single subject. However, the statistical, tech. and financial implications of 
pooling have not been explicitly investigated. Modeling the resulting gene expression 
from sample pooling as a mixt. of individual responses, we derived expressions for the 
exptl. error and provided both upper and lower bounds for its value in terms of the 
variability among individuals and the no. of RNA samples pooled. Using "virtual" 
pooling of data from real expts. and computer simulations, we investigated the 
statistical properties of RNA sample pooling. Our study reveals that pooling biol. 
samples appropriately is statistically valid and efficient for microarray expts. 
Furthermore, optimal pooling design(s) can be found to meet statistical requirements 
while minimizing total cost. Appropriate RNA pooling can provide equiv. power and 
improve efficiency and cost-effectiveness for microarray expts. with a modest increase 
in total no. of subjects. Pooling schemes in terms of replicates of subjects and arrays 
can be compared before expts. are conducted. 
AB We introduce simple graphical classification and prediction tools for tumor status 
using gene expression profiles. They are based on two dimension estn. techniques, 
sliced av. variance estn. (SAVE) and sliced inverse regression (SIR). Both SAVE and 
SIR are used to infer on the dimension of the classification problem and obtain linear 
combinations of genes that contain sufficient information to predict class membership, 
such as tumor type. Plots of the estd. directions as well as numerical thresholds estd. 
from the plots are used to predict tumor classes in cDNA microarrays and the 
performance of the class predictors is assessed by cross-validation. A 
***microarray*** ***simulation*** study is carried out to compare the power and 
predictive accuracy of the two methods. The methods are applied to cDNA microarray 
data on BRCA1 and BRCA2 mutation carriers as well as sporadic tumors from Hedenfalk 
et al. (2001). All samples are correctly classified. 
AB The occurrence of false positives and false negatives in a microarray anal. could 
be easily estd. if the distribution of p-values were approximated and then expressed as 
a mixt. of null and alternative densities. Essentially any distribution of p-values can be 
expressed as such a mixt. by extg. a uniform d. from it. A model is introduced that 
frequently describes very accurately the distribution of a set of p-values arising from an 
array anal. The model is used to obtain an estd. distribution that is easily expressed as 
a mixt. of null and alternative densities. Given a threshold of significance, the estd. 
distribution is partitioned into regions corresponding to the occurrences of false 
positives, false negatives, true positives, and true negatives. 
AB A statistical model is proposed for the anal. of errors in microarray expts. and is 
employed in the anal. and development of a combined normalization regime. Through 
anal. of the model and two-dye microarray data sets, this study found the following. 
The systematic error introduced by microarray expts. mainly involves spot intensity- 
dependent, feature-specific and spot position-dependent contributions. It is difficult to 
remove all these errors effectively without a suitable combined normalization operation. 
Adaptive normalization using a suitable regression technique is more effective in 
removing spot intensity-related dye bias than self-normalization, while regional 
normalization (block normalization) is an effective way to correct spot position- 
dependent errors. However, dye-flip replicates are necessary to remove feature- 
specific errors, and also allow the analyst to identify the exptl. introduced dye bias 
contained in non-self-self data sets. In this case, the bias present in the data sets may 
include both exptl. introduced dye bias and the biol. difference between two samples. 
Self-normalization is capable of removing dye bias without identifying the nature of that 
bias. The performance of adaptive normalization, on the other hand, depends on its 
ability to correctly identify the dye bias. If adaptive normalization is combined with an 
effective dye bias identification method then there is no systematic difference between 
the outcomes of the two methods. 
AB As DNA microarrays are widely used recently, the amt. of microarray data is 
exponentially increasing. Until now, however, no domestic system is available for the 
efficient management of such data. Because the no. of exptl. data in a specific lab. is 
limited, it is necessary to avoid redundant expts. and to accumulate the results using a 
shared data management system for microarrays. In this paper, a system named 
WEMA (WEb management of MicroArrays) was designed and implemented to manage 
and process the microarray data. WEMA system was designed to include the basic 
feature of MIAME (Minimal Information About a Microarray Expt), and general data 
units were also defined in the system in order to systematically manage the data. The 
WEMA system has three main features: efficient management of microarray data, 
integration of input/output data, and metafile processing. The system was tested with 
actual microarray data produced by a mol. biol. lab., and we found that the biologists 
could systematically manage and easily analyze the microarray data. As a 
consequence, the researchers could reduce the cost of data exchange and 
communication. 
AB Global transcriptome data is increasingly combined with sophisticated math. 
analyses to ext. information about the functional state of a cell. Yet the extent to which 
the results reflect exptl. bias at the expense of true biol. information remains largely 
unknown. Here we show that the spatial arrangement of probes on microarrays and 
the particulars of the printing procedure significantly affect the log-ratio data of mRNA 
expression levels measured during the Saccharomyces cerevisiae cell cycle. We present 
a numerical method that filters out these technol.-derived contributions from the 
existing transcriptome data, leading to improved functional predictions. The example 
presented here underlines the need to routinely search and compensate for inherent 
exptl. bias when analyzing systematically collected, internally consistent biol. data sets. 



RE.CNT 34 THERE ARE 34 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 119 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:579645 CAPLUS 
DN 139:255887 

TI 2HAPI: a microarray data analysis system 

AU Fink, J. Lynn; Drewes, Scott; Patel, Hiren; Welsh, John B.; Masys, Daniel R.; 
Corbeil, Jacques; Gribskov, Michael 

CS San Diego Supercomputer Center, San Diego, CA, 92093-0537, USA 

SO Bioinformatics (2003), 19(11), 1443-1445 CODEN: BOINFP; ISSN: 1367-4803 

PB Oxford University Press 

DT Journal 

LA English 

AB 2HAPI (version 2 of High d. Array Pattern Interpreter) is a web-based, publicly 
available anal. tool designed to aid researchers in microarray data anal. 2HAPI includes 
tools for searching, manipulating, visualizing, and clustering the large sets of data 
generated by microarray expts. Other features include assocn. of genes with NCBI 
information and linkage to external data resources. Unique to 2HAPI is the ability to 
retrieve upstream sequences of co-regulated genes for promoter anal. using MEME 
(Multiple Expectation-maximization for Motif Elicitation). 2HAPI is freely available at 
http://array.sdsc.edu. 
AB Data preprocessing including proper normalization and adequate quality control 
before complex data mining is crucial for studies using the cDNA microarray technol. 
We have developed a simple procedure that integrates data filtering and normalization 
with quant. quality control of microarray expts. Previously we have shown that data 
variability in a microarray expt. can be very well captured by a quality score qcom that 
is defined for every spot, and the ratio distribution depends on qcom. Utilizing this 
knowledge, our data-filtering scheme allows the investigator to decide on the filtering 
stringency according to desired data variability, and our normalization procedure 
corrects the qcom-dependent dye biases in terms of both the location and the spread 
of the ratio distribution. In addn., we propose a statistical model for false pos. rate 
detn. based on the design and the quality of a microarray expt. The model predicts 
that a lower limit of 0.5 for the replicate concordance rate is needed in order to be 
certain of true positives. Our work demonstrates the importance and advantages of 
having a quant. quality control scheme for microarrays. 
AB The focus of this paper is on two new normalization methods for cDNA 
microarrays. After the image anal. has been performed on a microarray and before 
differentially expressed genes can be detected, some form of normalization must be 
applied to the microarrays. Normalization removes biases towards one or other of the 
fluorescent dyes used to label each mRNA sample allowing for proper evaluation of 
differential gene expression. The two normalization methods that we present here 
build on previously described non-linear normalization techniques. We extend these 
techniques by firstly introducing a normalization method that deals with smooth spatial 
trends in intensity across microarrays, an important issue that must be dealt with. 
Secondly we deal with normalization of a new type of cDNA microarray expt. that is 
coming into prevalence, the small scale specialty or 'boutique' array, where large 
proportions of the genes on the microarrays are expected to be highly differentially 
expressed. The normalization methods described in this paper are available via 
http://www.pi.csiro.au/gena/ in a software suite called tRMA. 
AB The objective of this study is to explore aspects of the statistical anal. of gene 
expression response at the muscle tissue level to varying levels of energy and protein 
in the diet. Eleven Brahman and Brahman composite steers (weighing 302 .+-. 9.8 kg, 
on av.) were allocated randomly into high- (HIGH), medium- (MED), and low- (LOW) 
quality forage diets for 27 d. After this period, a biopsy of the longissimus dorsi muscle 
was taken from each animal and total RNA was extd. to generate the labeled target for 
microarray experimentation. These targets were hybridized to a complementary DNA 
(cDNA) microarray of 9,274 probes from cattle muscle and s.c. fat cDNA libraries. After 
edits, 151,904 expression intensity levels of 4,747 genes were analyzed. Emphasis was 
given to the choice of power transformation of the intensity channel readings and to 
the consistency of readings within each diet quality group. The statistical approach to 
isolate differentially expressed genes was based on model-based clustering via a mixt. 
of normal distributions estd. through maximal likelihood. The base-2 logarithm was 
found to be the optimal power transformation to normalize gene intensity levels. A 
two-sample t-statistic was defined as a measure of possible differential expression. For 
each of the three diet contrasts, HIGH vs. LOW, HIGH vs. MED, and MED vs. LOW, 
three clusters were found, two of which contained more than 94% genes with almost 
no altered gene expression levels, whereas the third cluster contained the remaining 
genes with a differential expression. Results from the HIGH vs. LOW contrast identified 
27 genes with a greater than 95% posterior probability of belonging to the cluster of 
differentially expressed genes. 
AB A method is disclosed for providing a certified biochem. profile of a biol. sample. 
The biochem. profile includes a plurality of data objects for a plurality of mol. markers. 
The method comprises: (1) providing a first data set for the plurality of mol. markers of 
the biol. sample by a first process from a first elec. signal representing a first 
unprocessed image data; (2) providing a second data set for said plurality of mol. 
markers of the biol. sample by a second process from a second elec. signal 
representing a second unprocessed image data, wherein the first process is different 
from said second process; and (3) comparing, by a computer-readable program code, 
the first and second data sets, whereby a certified biochem. profile is generated if no 
discrepancy between the first and second data sets are detected. Preferably, the steps 
of the method are coordinated by another computer-readable program code. Systems 
and computer program products embodying the method or useful in the method are 
also disclosed. 
AB A method of analyzing DNA microarray data based on the phys. modeling of 
hybridization is presented. We demonstrate, in exptl. data, a correlation between 
obsd. hybridization intensity and calcd. free energy of hybridization. Then, combining 
hybridization rate equations, calcd. free energies of hybridization, and microarray data 
for known target concns., we construct an algorithm to compute transcript concn. 
levels from microarray data. We also develop a method for eliminating outlying data 
points identified by our algorithm. We test the efficacy of these methods by comparing 
our results with an existing statistical algorithm, as well as by performing a cross- 
validation test on our model. 
AB A method of obtaining a cor. image of a microarray includes acquiring an image of 
a microarray including a target spot, and processing the image to correct for 
background noise and chip misalignment. The method also includes analyzing the 
image to identify a target patch, edit debris, and correct for ratio bias; and detecting 
single copy no. variation in the target spot using an objective statistical anal. that 
includes a t-value statistical anal. The method provides statistically robust 
computational processes for accurately detecting genomic variation at the single copy 
level. 
AB We describe the use of a statistical model in a genome-wide microarray-based 
yeast genetic screen performed by imposing different genetic selections on thousands 
of yeast mutants in parallel. A mixt. model is fitted to data obtained from 
oligonucleotide arrays hybridized to 20-mer oligonucleotide "barcodes" and a procedure 
based on the fitted model is used to search for mutants differentially represented under 
exptl. and control conditions. The fitted stochastic model provides a way to assess 
uncertainty. We demonstrate the usefulness of the model by applying it to the problem 
of screening for components of the nonhomologous end joining (NHEJ) pathway and 
identified known components of the NHEJ pathway. 
AB In microarray studies, an important problem is to compare a predictor of disease 
outcome derived from gene expression levels to std. clin. predictors. Comparing them 
on the same dataset that was used to derive the microarray predictor can lead to 
results strongly biased in favor of the microarray predictor. The authors propose a new 
technique called "pre-validation" for making a fairer comparison between the two sets 
of predictors. The authors study the method anal. and explore its application in a 
recent study on breast cancer. 
AB A review. The ***computerized*** strategies of the gene anal. for DNA 
***microarrays*** were described. The outlined flow of the anal. processes of the 
cDNA microarrays anal., transcriptional profiling anal. of cancers, and gene expression 
arrays anal. of microdissected tissue samples was discussed. 
RE.CNT 54 THERE ARE 54 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 


L6 ANSWER 129 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:501135 CAPLUS 
DN 139:144560 

TI GenePublisher: automated analysis of DNA microarray data 

AU Knudsen, Steen; Workman, Christopher; Sicheritz-Ponten, Thomas; Friis, Carsten 

CS Center for Biological Sequence Analysis, BioCentrum-DTU, Lyngby, 2800, Den. 
AB GenePublisher, a system for automatic anal. of data from DNA microarray expts., 
has been implemented with a web interface. Raw data are uploaded to the server 
together with a specification of the data. The server performs normalization, statistical 
anal. and visualization of the data. The results are run against databases of signal 
transduction pathways, metabolic pathways and promoter sequences in order to ext. 
more information. The results of the entire anal. are summarized in report form and 
returned to the user. 
TI ExpressYourself: a modular platform for processing and visualizing microarray data 
Horak, Christine E.; Chang, Joseph T.; Snyder, Michael; Gerstein, Mark 
CS Department of Molecular Biophysics and Biochemistry, Yale University, New 
AB DNA microarrays are widely used in biol. research; by analyzing differential 
hybridization on a single microarray slide, one can detect changes in mRNA expression 
levels, increases in DNA copy nos. and the location of transcription factor binding sites 
on a genomic scale. Having performed the expts., the major challenge is to process 
large, noisy datasets in order to identify the specific array elements that are 
significantly differentially hybridized. This normally requires aggregating different, 
often incompatible programs into a multi-step pipeline. Here the authors present 
ExpressYourself, a fully integrated platform for processing microarray data. In 
completely automated fashion, it will correct the background array signal, normalize the 
Cy5 and Cy3 signals, score levels of differential hybridization, combine the results of 
replicate expts., filter problematic regions of the array and assess the quality of 
individual and replicate expts. ExpressYourself is designed with a highly modular 
architecture so various types of microarray anal. algorithms can readily be incorporated 
as they are developed; for example, the system currently implements several 
normalization methods, including those that simultaneously consider signal intensity 
and slide location. The processed data are presented using a web-based graphical 
interface to facilitate comparison with the original images of the array slides. In 
particular, ExpressYourself is able to regenerate images of the original microarray after 
applying various steps of processing, which greatly facilitates identification of 
position-specific artifacts. The program is freely available for use at 
http://bioinfo.mbb.yale.edu/expressyourself. 
AB Optimal design of oligonucleotides for microarrays involves tedious and laborious 
work evaluating potential oligonucleotides relative to a series of parameters. The 
currently available tools for this purpose are limited in their flexibility and do not 
present the oligonucleotide designer with an overview of these parameters. We 
present here a flexible tool named OligoWiz for designing oligonucleotides for multiple 
purposes. OligoWiz presents a set of parameter scores in a graphical interface to 
facilitate an overview for the user. Addnl. custom parameter scores can easily be 
added to the program to extend the default parameters: homol., .DELTA.Tm, 
low-complexity, position and GATC-only. Furthermore we present an anal. of the 
limitations in designing oligonucleotide sets that can detect transcripts from multiple 
organisms. 
AB We present a web-based pipeline for microarray gene expression profile anal., 
GEPAS, which stands for Gene Expression Profile Anal. Suite (). GEPAS is composed of 
different interconnected modules which include tools for data pre-processing, two- 
conditions comparison, unsupervised and supervised clustering (which include some of 
the most popular methods as well as home made algorithms) and several tests for 
differential gene expression among different classes, continuous variables or survival 
anal. A multiple purpose tool for data mining, based on Gene Ontol., is also linked to 
the tools, which constitutes a very convenient way of analyzing clustering results. 
Online tutorials are available at http://bioinfo.cnio.es. 
AB We have analyzed the detection of microcantilevers utilized in biosensing chips. 
First, the primary deflection due to the chem. reaction between the analyte mols. and 
the receptor coating, which produces surface stresses on the receptor side is analyzed. 
Oscillating flow conditions, which are the main source of turbulence in cantilever based 
biosensing chips, are found to produce substantial deflections in the microcantilever at 
relatively large frequency of turbulence. Then mech. design and optimization of 
piezoresistive cantilevers for biosensing applications is studied. Models are described 
for predicting the static behavior of cantilevers with elastic and piezoresistive layers. 
Chemo-mech. binding forces have been analyzed to understand issues of satn. over the 
cantilever surface. Furthermore, the introduction of stress concn. regions during 
cantilever fabrication has been discussed which greatly enhances the detection 
sensitivity through increased surface stress, and novel microcantilever assemblies are 
presented for the first time that can increase the deflection due to chem. reaction. 
Finally an expt. was made to demonstrate the shift of resonant frequency of cantilever 
used as biosensor. The relation between resonant frequency shift and the surface 
stress was analyzed. 
AB An understanding of the multi-step nature of cancer as it is in the breast, as a 
series of pivotal genetic/epigenetic modifications is irrefutably a milestone in 
diagnostics, prognostics and eventually providing a cure. Here we have utilized a 
variant of anal. of variance (ANOVA) as a model for the identification and tracking of 
specific mRNA species whose transcription has been significantly altered at each grade 
in the progression of ductal carcinoma, making it possible to correlate histol. 
progression with the genetic events underlying breast cancer. We show that in the 
progression of ductal carcinomas, from grade 1 to 3, there is a redn. in the actual no. 
of mRNA species, which are significantly over or under expressed. We also show that 
this technique can be employed to generate differential gene expression patterns, 
whereby the combined expression profile of the tailored spectra of genes in the 
comparison of each ductal grade is sufficient to render them on clearly sep. arms of an 
array-wise hierarchical cluster dendrogram. 
AB In this paper we derive a method for evaluating and improving techniques for 
selecting informative genes from microarray data. Genes of interest are typically 
selected by ranking genes according to a test-statistic and then choosing the top k 
genes. A problem with this approach is that many of these genes are highly correlated. 
For classification purposes it would be ideal to have distinct but still highly informative 
genes. We propose three different pre-filter methods - two based on clustering and 
one based on correlation - to retrieve groups of similar genes. For these groups we 
apply a test-statistic to finally select genes of interest. We show that this filtered set of 
genes can be used to significantly improve existing classifiers. 
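
One possible reading of the correlation-based pre-filter described in the abstract above can be sketched as follows. The abstract does not give the actual procedure, so this greedy scheme, the 0.9 cutoff, and all names (`pearson`, `select_genes`, the toy profiles and scores) are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_genes(profiles, scores, k, max_corr=0.9):
    """Pick up to k genes by test-statistic, skipping near-duplicates.

    profiles: gene name -> list of expression values (one per sample)
    scores:   gene name -> test-statistic (higher = more informative)
    A gene is skipped when its absolute correlation with an already
    selected gene reaches max_corr (an assumed, illustrative cutoff).
    """
    ranked = sorted(profiles, key=lambda g: scores[g], reverse=True)
    selected = []
    for gene in ranked:
        if len(selected) == k:
            break
        if all(abs(pearson(profiles[gene], profiles[s])) < max_corr
               for s in selected):
            selected.append(gene)
    return selected


# Hypothetical toy data: g2 is almost a scaled copy of g1.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.1],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
scores = {"g1": 5.0, "g2": 4.0, "g3": 1.0}
print(select_genes(profiles, scores, k=2))
```

Plain top-k ranking would return g1 and g2 here; the filter replaces the redundant g2 with the less correlated g3.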
AB This invention relates to statistical anal. of differential gene expression data using 
a pairwise comparison of two different samples wherein each sample generates greater 
than 150,000 mRNA mols. Methods, computer programs and systems are provided for 
the anal. and comparison of gene frequency distributions generated by one or more 
replicate samples or by independent sampling procedures. To det. differential gene 
expression in a sample compared to another sample(s), the level of expression of a 
given mRNA in a given sample is detd. For example, the level of expression of any 
single gene in a dataset is calcd. by dividing the no. of signatures from that gene by 
the total no. of signatures for all mRNAs present in the dataset. The level of 
expression of a particular mRNA in one sample is statistically compared to the level of 
expression of the same particular mRNA in another sample. Provided is a statistical 
test of significance which is a normal approxn. test, which comprises a two-tailed test 
for the total no. of signature sequences generated from mRNA mols. in the first and 
second samples by massively parallel signature sequencing (MPSS). 
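
The calcn. described in the abstract above (expression level as a gene's signature count divided by the total signature count, compared between two samples with a two-tailed normal approxn. test) can be sketched as follows. The abstract does not state the exact test statistic, so a standard pooled two-proportion z-test is assumed here; the function names and counts are hypothetical.

```python
import math


def expression_level(gene_signatures, total_signatures):
    """Fraction of all signatures in a dataset that come from one gene."""
    return gene_signatures / total_signatures


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed normal-approximation test for a difference in proportions.

    x1, x2: signature counts for the gene in samples 1 and 2
    n1, n2: total signature counts in samples 1 and 2
    Uses the pooled standard error (an assumption, not from the source).
    Returns (z, two-tailed p-value).
    """
    p1 = expression_level(x1, n1)
    p2 = expression_level(x2, n2)
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero; counts are degenerate")
    z = (p1 - p2) / se
    # Two-tailed p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 150 vs. 100 signatures out of 200,000 per sample.
z, p = two_proportion_z_test(150, 200_000, 100, 200_000)
print(f"z = {z:.3f}, p = {p:.5f}")
```

With total counts above 150,000 per sample, as the abstract requires, the normal approxn. to the binomial is reasonable for genes that are not extremely rare.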

L6 ANSWER 137 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:461964 CAPLUS 
DN 139:225051 

TI OligoArray 2.0: design of oligonucleotide probes for DNA microarrays using a 
thermodynamic approach 

AU Rouillard, Jean-Marie; Zuker, Michael; Gulari, Erdogan 

CS Department of Chemical Engineering, University of Michigan, H.H. Dow, Ann 
Arbor, MI, 48109, USA 

SO Nucleic Acids Research (2003), 31(12), 3057-3062 CODEN: NARHAD; ISSN: 0305- 
1048 

PB Oxford University Press 
DT Journal 
LA English 

AB There is a substantial interest in implementing bioinformatics technologies that 
allow the design of oligonucleotides to support the development of microarrays made 
from short synthetic DNA fragments spotted or in situ synthesized on slides. Ideally, 
such oligonucleotides should be totally specific to their resp. targets to avoid any cross- 
hybridization and should not form stable secondary structures that may interfere with 
the labeled probes during hybridization. We have developed OligoArray 2.0, a program 
that designs specific oligonucleotides at the genomic scale. It uses a thermodn. 
approach to predict secondary structures and to calc. the specificity of targets on chips 
for a unique probe in a mixt. of labeled probes. Furthermore, OligoArray 2.0 can adjust 
the oligonucleotide length, according to user input, to fit a narrow Tm range compatible 
with hybridization requirements. Combined with on chip oligonucleotide synthesis, this 
program makes it feasible to perform expression anal. on a genomic scale for any 
organism for which the genome sequence is known. This is without relying on cDNA or 
oligonucleotide libraries. OligoArray 2.0 was used to design 75 764 oligonucleotides 
representing 26 140 transcripts from Arabidopsis thaliana. Among this set, we provide 
at least one specific oligonucleotide for 93% of these transcripts. 
AB Exptl. gene expression data sets, such as those generated by microarray or gene 
chip expts., typically have significant noise and complicated interconnectivities that 
make understanding even simple regulatory patterns difficult. Given these 
complications, characterizing the effectiveness of different anal. techniques to uncover 
network groups and structures remains a challenge. Generating simulated expression 
patterns with known biol. features of expression complexity, diversity and 
interconnectivities provides a more controlled means of investigating the 
appropriateness of different anal. methods. A simulation-based approach can 
systematically evaluate different gene expression anal. techniques and provide a basis 
for improved methods in dynamic metabolic network reconstruction. We have 
developed an online ***simulator*** , called eXPatGen, to generate dynamic gene 
expression patterns typical of ***microarray*** expts. eXPatGen provides a quant. 
network structure to represent key biol. features, including the induction, repression, 
and cascade regulation of mRNA (mRNA). The simulation is modular such that the 
expression model can be replaced with other representations, depending on the level of 
biol. detail required by the user. Two example gene networks, of 25 and 100 genes 
resp., were simulated. Two std. anal. techniques, clustering and PCA anal., were 
performed on the resulting expression patterns in order to demonstrate how the 
simulator might be used to evaluate different anal. methods and provide exptl. 
guidance for biol. studies of gene expression. 
AB Given the vast amt. of gene expression data, it is essential to develop a simple and 
reliable method of investigating the fine structure of gene interaction. The authors 
introduce an information geometric measure of binary random vectors and show how 
this measure reveals the fine structure of gene interaction. In particular, we propose 
an iterative procedure by using this measure (called IPIG). The procedure finds higher- 
order dependencies which may underlie the interaction between two genes of interest. 
To demonstrate the method, we investigate the interaction between the two genes of 
interest in the data from human acute lymphoblastic leukemia cells. The method 
successfully discovered biol. known findings and also selected other genes as hidden 
causes that constitute the interaction. The softwares are currently not available but 
are possibly made available in future at http://www.mns.brain.riken.go.jp/ 
~nakahara/DNA _pub.html, where all the related information is also linked. 
RE.CNT 16 THERE ARE 16 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 
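
The abstract above does not state the measure itself. As a hedged sketch, the pairwise interaction coordinate of the standard log-linear model for two binary variables, theta_12 = log(p11 p00 / (p10 p01)), can be estimated as follows. The function name `interaction_theta` and the pseudocount are assumptions made for this example, and the iterative IPIG procedure is not reproduced.

```python
import math

def interaction_theta(x1, x2, pseudocount=0.5):
    """Estimate theta_12 = log(p11 * p00 / (p10 * p01)) for two binary vectors.

    A pseudocount is added to every cell so that empty cells do not give
    log(0); this smoothing is an assumption of the sketch.
    """
    counts = {(0, 0): pseudocount, (0, 1): pseudocount,
              (1, 0): pseudocount, (1, 1): pseudocount}
    for a, b in zip(x1, x2):
        counts[(a, b)] += 1
    # The normalizing constant cancels, so counts can be used directly.
    return math.log(counts[(1, 1)] * counts[(0, 0)]
                    / (counts[(1, 0)] * counts[(0, 1)]))
```

A value near zero indicates no pairwise interaction under this model, a positive value indicates co-activation, and a negative value indicates mutually exclusive activity.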

L6 ANSWER 140 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:461869 CAPLUS 
DN 139:174390 

TI Bagging to improve the accuracy of a clustering procedure 
AU Dudoit, Sandrine; Fridlyand, Jane 

CS School of Public Health, Division of Biostatistics, University of California, Berkeley, 
Berkeley, CA, 94720-7360, USA 

SO Bioinformatics (2003), 19(9), 1090-1099 CODEN: BOINFP; ISSN: 1367-4803 
PB Oxford University Press 
DT Journal 
LA English 

AB The microarray technol. is increasingly being applied in biol. and medical research 
to address a wide range of problems such as the classification of tumors. An important 
statistical question assocd. with tumor classification is the identification of new tumor 
classes using gene expression profiles. Essential aspects of this clustering problem 
include identifying accurate partitions of the tumor samples into clusters and assessing 
the confidence of cluster assignments for individual samples. Two new resampling 
methods, inspired by bagging in prediction, are proposed to improve and assess the 
accuracy of a given clustering procedure. In these ensemble methods, a partitioning 
clustering procedure is applied to bootstrap learning sets and the resulting multiple 
partitions are combined by voting or the creation of a new dissimilarity matrix. As in 
prediction, the motivation behind bagging is to reduce variability in the partitioning 
results via averaging. The performances of the new and existing methods were 
compared using ***simulated*** data and gene expression data from two recently 
published cancer ***microarray*** studies. The bagged clustering procedures were 
in general at least as accurate and often substantially more accurate than a single 
application of the partitioning clustering procedure. A valuable byproduct of bagged 
clustering is the set of cluster votes, which can be used to assess the confidence of cluster 
assignments for individual observations. 
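
The voting variant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means stands in for the partitioning procedure, and the names `kmeans` and `bagged_labels` and all parameters are hypothetical.

```python
import random
from itertools import permutations

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    centers = rng.sample(sorted(set(points)), k)  # k distinct starting centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels

def bagged_labels(points, k, n_boot=50, seed=0):
    """Bagged clustering by voting; returns (labels, vote fractions)."""
    rng = random.Random(seed)
    n = len(points)
    reference = kmeans(points, k, rng)       # clustering of the original data
    votes = [[0] * k for _ in range(n)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap learning set
        sample = [points[i] for i in idx]
        if len(set(sample)) < k:
            continue                          # too few distinct points to split
        boot = kmeans(sample, k, rng)
        # Relabel bootstrap clusters to maximize overlap with the reference.
        best = max(permutations(range(k)),
                   key=lambda perm: sum(perm[b] == reference[i]
                                        for i, b in zip(idx, boot)))
        for i, b in zip(idx, boot):
            votes[i][best[b]] += 1
    labels = [max(range(k), key=lambda c: v[c]) for v in votes]
    confidence = [max(v) / sum(v) if sum(v) else 0.0 for v in votes]
    return labels, confidence
```

The vote fraction returned for each observation plays the role of the cluster votes mentioned in the abstract: values near 1 indicate a stable assignment across bootstrap replicates.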
AB Oligonucleotide microarrays are among a set of technologies that allow for high 
throughput assessment of vast nos. of gene expressions. In order to evaluate gene 
expressions given detection limits, antibody spiking is often used, providing one with an 
expression curve relating antibody treated expression and non-antibody treated 
expression. These curves can exhibit different functional shapes across chips and 
hence need to be standardized. In addn., each curve is subject to satn. effects, which 
are typically dealt with by extrapolating a linear fit to the subset of the data not visually 
subject to satn. In this paper we introduce methods for the non-parametric 
standardization of expression curves using univariate smoothers. We also explore 
parametric methods for more efficient anal. of the standardized curves. We 
demonstrate an alternate method of parametric anal. using a weighted linear mixed 
effects model that does not arbitrarily delete data beyond an obsd. satn. point, allows 
for natural grouping of genes, and provides significantly more accurate predictions than 
naive linear extrapolation. Both methodologies are studied through sets of simulations. 
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AB We report a new theor. approach to optimize the performance and quantify the 
results of gene expression oligonucleotide microarrays which are widely used in 
biomedical research. An on-array hybridization isotherm that takes into account the 
screened Coulomb repulsion between the assayed nucleic acid target and the layer of 
surface tethered oligonucleotide probes is presented. The hybridization efficiency is 
found as a function of the genomic target (sequence, length, and concn.), array 
parameters (probe sequence and length, surface probe d.), and hybridization 
conditions (temp. and buffer ionic strength). We present simple relations for the 
hybridization signal max. and the linear dynamic detection range and show explicit 
criteria for optimization. The approach is based on an extension of our recently 
published theory (Vainrub, A.; Pettitt, B. M. Phys. Rev. E 2002, 66, art. no. 041905) 
which we generalize here for the cases of target depletion effects and arbitrary target 
length. 
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AB The present invention relates to genetic markers whose expression is correlated 
with progression of chronic myelogenous leukemia (CML). Specifically, the invention 
provides 366 markers whose expression patterns can be used to differentiate chronic 
phase individuals from those in blast crisis. The marker sets were identified by detg. 
which of .apprx.25,000 human markers had expression patterns that correlated with 
the conditions or indications. The invention relates to methods of using these markers 
to distinguish these conditions. The invention also relates to kits contg. ready-to-use 
***microarrays*** and ***computer*** software for data anal. using the 
statistical methods disclosed herein. 
AB The different computational methods commonly used in microarray data anal. are 
described. The features of current major software packages and other tools and 
examples of applications of data anal. methods in studies of membrane transporters 
are presented. 
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AB DNA microarrays are used to produce large sets of expression measurements from 
which specific biol. information is sought. Their anal. requires efficient and reliable 
algorithms for dimensional redn., classification and annotation. We study networks of 
co-expressed genes obtained from DNA microarray expts. The math. concept of 
curvature on graphs is used to group genes or samples into clusters to which relevant 
gene or sample annotations are automatically assigned. Application to publicly 
available yeast and human lymphoma data demonstrates the reliability of the method 
in spite of its simplicity, esp. with respect to the small no. of parameters involved. We 
provide a method for automatically detg. relevant gene clusters among the many genes 
monitored with microarrays. The automatic annotations and the graphical interface 
improve the readability of the data. A C++ implementation, called Trixy, is available 
from http://tagc.univ-mrs.fr/bioinformatics/trixy.html. 
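The abstract does not define the curvature used by Trixy. One common graph-theoretic notion, assumed here purely for illustration, is the local clustering coefficient: the fraction of a node's neighbor pairs that are themselves linked. The function name `curvature` and the adjacency format are hypothetical.

```python
from itertools import combinations

def curvature(adj):
    """Local clustering coefficient per node.

    adj maps each node to the set of its neighbors (undirected graph,
    e.g. genes linked when their expression profiles are co-expressed).
    """
    result = {}
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            result[node] = 0.0      # no neighbor pairs, so no triangles
            continue
        triangles = sum(1 for a, b in combinations(neighbors, 2)
                        if b in adj[a])
        result[node] = triangles / (k * (k - 1) / 2)
    return result
```

Nodes with high values sit inside densely connected neighborhoods, which is the kind of structure a curvature-based grouping of genes or samples would look for.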
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AB The detection of hybridization events on oligonucleotide microarrays in real time 
can be performed using the optical principle of total internal reflection fluorescence 
(TIRF). We have investigated and compared three TIRF-sensing configurations using 
two bulk and one integrated optical planar waveguide as transducer platforms for 
oligonucleotide microarrays, which have been brought in contact with flow cells. Based 
on the ray optics model, expressions were derived for the calcn. of the intensity of the 
CCD-camera signal generated by solved fluorophores in the flow cell vol. A noise anal. 
was performed and expressions for the calcn. of the detection limit of the surface 
fluorophore d. were derived. With a bulk optical single total internal reflection 
configuration a detection limit of 3.74 mols./.mu.m2, with a bulk optical multiple total 
internal reflection configuration a detection limit of 1.83 mols./.mu.m2 and with the 
integrated optical waveguide (IOW) configuration a detection limit of 0.013 
mols./.mu.m2 was numerically estd. based on background data of the bulk vol. signal. 
The derived anal. expressions address the full system, including light source, optical 
waveguide and the detection unit and can serve as a tool for TIRF-system design. 
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AB A major problem of pattern classification is estn. of the Bayes error when only 
small samples are available. One way to est. the Bayes error is to design a classifier 
based on some classification rule applied to sample data, est. the error of the designed 
classifier, and then use this est. as an est. of the Bayes error. Relative to the Bayes 
error, the expected error of the designed classifier is biased high, and this bias can be 
severe with small samples. This paper provides a correction for the bias by subtracting 
a term derived from the representation of the estn. error. It does so for Boolean 
classifiers, these being defined on binary features. Although the general theory applies 
to any Boolean classifier, a model is introduced to reduce the no. of parameters. A key 
point is that the expected correction is conservative. Properties of the cor. est. are 
studied via simulation. The correction applies to binary predictors because they are 
math. identical to Boolean classifiers. In this context the correction is adapted to the 
coeff. of detn., which has been used to measure nonlinear multivariate relations 
between genes and design genetic regulatory networks. An application using gene- 
expression data from a microarray expt. is provided on the website 
http://gspsnap.tamu.edu/smallsample/ (user: 'smallsample', password: 'smallsample'). 
RE.CNT 6 THERE ARE 6 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 149 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:404307 CAPLUS 
DN 139:144497 

TI Approximate variance-stabilizing transformations for gene-expression microarray 
data 

AU Rocke, David M.; Durbin, Blythe 

CS Department of Applied Science, University of California, Davis, Davis, CA, 95616, 
USA 

SO Bioinformatics (2003), 19(8), 966-972 CODEN: BOINFP; ISSN: 1367-4803 
PB Oxford University Press 
DT Journal 
LA English 

AB A variance stabilizing transformation for microarray data was recently introduced 
independently by several research groups. This transformation has sometimes been 
called the generalized logarithm or glog transformation. In this paper, we derive 
several alternative approx. variance stabilizing transformations that may be easier to 
use in some applications. We demonstrate that the started-log and the log-linear- 
hybrid transformation families can produce approx. variance stabilizing transformations 
for microarray data that are nearly as good as the generalized logarithm (glog) 
transformation. These transformations may be more convenient in some applications. 
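
The abstract names three transformation families without giving formulas. The sketch below uses the forms in which they are commonly written; the parameter names `lam`, `c` and `k` are assumptions for this example, and in practice each parameter would be estimated from the data.

```python
import math

def glog(y, lam):
    """Generalized logarithm: ln(y + sqrt(y^2 + lam)), lam > 0."""
    return math.log(y + math.sqrt(y * y + lam))

def started_log(y, c):
    """Started logarithm: ln(y + c), defined for y > -c."""
    return math.log(y + c)

def log_linear_hybrid(y, k):
    """Linear below the cutoff k, logarithmic above it.

    The linear branch y/k + ln(k) - 1 matches the log branch in both
    value and slope at y = k, so the transformation is smooth there.
    """
    if y > k:
        return math.log(y)
    return y / k + math.log(k) - 1.0
```

All three behave like an ordinary logarithm at high intensities (glog approaches ln(2y)) while remaining defined at zero or slightly negative background-corrected values, which is where the plain logarithm fails.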
AB A review. This article focuses on clustering techniques for the anal. of microarray 
data and discusses contributions and applications for the implementation of intelligent 
diagnostic systems and therapy design studies. Approaches to validating and 
visualizing expression clustering results and software and other relevant resources to 
support clustering-based analyses are reviewed. Finally, this paper addresses current 
limitations and problems that need to be investigated for the development of an 
advanced generation of pattern discovery tools. 
AB High-throughput cDNA microarray technol. allows for the simultaneous anal. of 
gene expression levels for thousands of genes and as such, rapid, relatively simple 
methods are needed to store, analyze, and cross-compare basic microarray data. The 
application of a classical method of data normalization, Z score transformation, 
provides a way of standardizing data across a wide range of expts. and allows the 
comparison of microarray data independent of the original hybridization intensities. 
Data normalized by Z score transformation can be used directly in the calcn. of 
significant changes in gene expression between different samples and conditions. We 
used Z scores to compare several different methods for predicting significant changes 
in gene expression including fold changes, Z ratios, Z and t statistical tests. We 
conclude that the Z score transformation normalization method accompanied by either 
Z ratios or Z tests for significance ests. offers a useful method for the basic anal. of 
microarray data. The results provided by these methods can be as rigorous as, and are no 
more arbitrary than, other test methods, and, in addn., they have the advantage that 
they can be easily adapted to std. spreadsheet programs. 
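A minimal sketch of the two calculations named above, under stated assumptions: intensities are log10-transformed before standardization, and a Z ratio is taken as the difference of Z scores divided by the standard deviation of all such differences. Neither detail is given in the abstract, and the function names are hypothetical.

```python
import math
import statistics

def z_scores(intensities):
    """Standardize log10 intensities of one array to mean 0 and SD 1."""
    logs = [math.log10(v) for v in intensities]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [(x - mu) / sd for x in logs]

def z_ratios(z_control, z_treated):
    """Z-score differences scaled by the SD of all differences."""
    diffs = [t - c for c, t in zip(z_control, z_treated)]
    sd = statistics.stdev(diffs)
    return [d / sd for d in diffs]
```

Because both steps use only a mean, a standard deviation and a subtraction per gene, the same calculation maps directly onto spreadsheet formulas.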
AB Provided is a ***computer*** -readable storage medium for 
***microarray*** oligonucleotide probe design. The computer-readable storage 
medium has stored thereon a first directory comprising information on DNA, RNA, 
protein, and/or genome of a target gene, a second directory comprising 
information on a specific region in the target gene, and a third directory contg. 
information on a probe for identifying the specific region. The first, second, and third 
directories are organized in a hierarchical structure in which the second directory is at a 
level lower than that of the first directory and the third directory is at a level lower than 
that of the second directory. These computer-readable storage media have 
applications in identifying mutations in genes assocd. with disease and may be used in 
diagnosis. 
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AB The use of large-scale microarray expression profiling to identify predictors of 
disease class has become of major interest. Beyond their impact in the clin. setting 
(i.e. improving diagnosis and treatment), these markers are also likely to provide clues 
on the mol. mechanisms underlying the diseases. In this paper we describe a new 
method for the identification of multiple gene predictors of disease class. The method 
is applied to the classification of two forms of arthritis that have a similar clin. endpoint 
but different underlying mol. mechanisms: rheumatoid arthritis (RA) and osteoarthritis 
(OA). We aim at both the classification of samples and the location of genes 
characterizing the different classes. We achieve both goals simultaneously by 
combining a binary probit model for classification with Bayesian variable selection 
methods to identify important genes. We find very small sets of genes that lead to 
good classification results. Some of the selected genes are clearly correlated with 
known aspects of the biol. of arthritis and, in some cases, reflect already known 
differences between RA and OA. 
AB The success of each method of cluster anal. depends on how well its underlying 
model describes the patterns of expression. Outlier-resistant and distribution- 
insensitive clustering of genes is robust against violations of model assumptions. A 
measure of dissimilarity that combines advantages of the Euclidean distance and the 
correlation coeff. is introduced. The measure can be made robust using a rank order 
correlation coeff. A robust graphical method of summarizing the results of cluster anal. 
and a biol. method of detg. the no. of clusters are also presented. These methods are 
applied to a public data set, showing that rank-based methods perform better than log- 
based methods. Software is available from http://www.davidbickel.com. 
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AB Large arrays of oligonucleotide probes have become popular tools for analyzing 
RNA expression. However, to date most oligo collections contain poorly validated 
sequences or are biased toward untranslated regions (UTRs). Here we present a 
strategy for picking oligos for microarrays that focus on a design universe consisting 
exclusively of protein coding regions. We describe the constraints in oligo design that 
are imposed by this strategy, as well as a software tool that allows the strategy to be 
applied broadly. In this work we sequentially apply a variety of simple filters to 
candidate sequences for oligo probes. The primary filter is a rejection of probes that 
contain contiguous identity with any other sequence in the sample universe that 
exceeds a pre-established threshold length. We find that rejection of oligos that 
contain 15 bases of perfect match with other sequences in the design universe is a 
feasible strategy for oligo selection for probe arrays designed to interrogate mammalian 
RNA populations. Filters to remove sequences with low complexity and predicted poor 
probe accessibility narrow the candidate probe space only slightly. Rejection based on 
global sequence alignment is performed as a secondary, rather than primary, test, 
leading to an algorithm that is computationally efficient. Splice isoforms pose unique 
challenges and we find that isoform prevalence will for the most part have to be detd. 
by anal, of the patterns of hybridization of partially redundant oligonucleotides. The 
oligo design program OligoPicker and its source code are freely available at our 
website: seed@molbio.mgh.harvard.edu. 
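The primary filter described above can be sketched as a k-mer lookup. This is an illustration under stated assumptions, not OligoPicker's code: the function names, the data layout and the choice to ignore reverse complements are all hypothetical, and a probe is rejected when it shares a stretch of `match_len` contiguous bases with any sequence other than its own source.

```python
def kmers(seq, k):
    """Set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def filter_probes(probes, universe, match_len=15):
    """Keep probes with no contiguous perfect match of match_len bases
    to any other sequence in the design universe.

    probes:   list of (source_id, probe_sequence)
    universe: dict mapping source_id to its coding sequence
    """
    # Index every sequence once so each probe needs only set intersections.
    index = {sid: kmers(seq, match_len) for sid, seq in universe.items()}
    kept = []
    for source_id, probe in probes:
        probe_kmers = kmers(probe, match_len)
        clash = any(probe_kmers & other
                    for sid, other in index.items() if sid != source_id)
        if not clash:
            kept.append((source_id, probe))
    return kept
```

Indexing fixed-length substrings avoids a full alignment for every probe, which is consistent with the abstract's point that global alignment is applied only as a secondary test.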
AB MARAN is a web-based application for normalizing microarray data. MARAN 
comprises a generic ANOVA model, an option for Loess fitting prior to ANOVA anal., 
and a module for selecting genes with significantly changing expression. 
AB Motivation: Accurate time series for biol. processes are difficult to est. due to 
problems of synchronization, temporal sampling and rate heterogeneity. Methods are 
needed that can utilize multi-dimensional data, such as those resulting from DNA 
microarray expts., in order to reconstruct time series from unordered or poorly ordered 
sets of observations. Results: We present a set of algorithms for estg. temporal 
orderings from unordered sets of sample elements. The techniques we describe are 
based on modifications of a min.-spanning tree calcd. from a weighted, undirected 
graph. We demonstrate the efficacy of our approach by applying these techniques to 
an artificial data set as well as several gene expression data sets derived from DNA 
microarray expts. In addn. to estg. orderings, the techniques we describe also provide 
useful heuristics for assessing relevant properties of sample datasets such as noise and 
sampling intensity, and we show how a data structure called a PQ-tree can be used to 
represent uncertainty in a reconstructed ordering. Availability: Academic 
implementations of the ordering algorithms are available as source code (in the 
programming language Python) on our web site, along with documentation on their 
use. The artificial 'jelly roll' data set upon which the algorithm was tested is also 
available from this web site. The publicly available gene expression data may be found 
at http://genome-www.stanford.edu/cellcycle/ and 
http://caulobactor.stanford.edu/CellCycle/. 
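A hedged sketch of the underlying idea: build a minimum spanning tree over the samples and read a candidate ordering off its longest path. Taking the longest (diameter) path is a simplification assumed for this example; the abstract speaks only of modifications of a minimum spanning tree, and the PQ-tree representation of uncertainty is not shown. All function names are hypothetical.

```python
import math

def mst_adjacency(points):
    """Prim's algorithm on the complete Euclidean graph of the samples."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    # best[v] = (distance to the tree, tree node giving that distance)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    while best:
        j = min(best, key=lambda v: best[v][0])
        w, i = best.pop(j)
        adj[i].append((j, w))
        adj[j].append((i, w))
        for v in best:
            d = math.dist(points[j], points[v])
            if d < best[v][0]:
                best[v] = (d, j)
    return adj

def farthest_path(adj, start):
    """Path from start to the node farthest from it along tree edges."""
    stack = [(start, None, 0.0, [start])]
    best = (0.0, [start])
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best[0]:
            best = (dist, path)
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + w, path + [nxt]))
    return best[1]

def estimate_order(points):
    """Sample indices along the longest path of the minimum spanning tree."""
    adj = mst_adjacency(points)
    one_end = farthest_path(adj, 0)[-1]
    return farthest_path(adj, one_end)
```

The ordering is recovered only up to reversal, and samples lying on side branches of the tree are omitted from the path; the length of those branches gives a rough indication of noise in the data.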
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AB The microarray technol. allows the high-throughput quantification of the mRNA 
level of thousands of genes under dozens of conditions, generating a wealth of data 
which must be analyzed using some form of computational means. A popular 
framework for such anal. is Matlab, a powerful computing language for which many 
functions have been written. However, although complex topics like neural networks or 
principal component anal. are freely available in Matlab, functions to perform more 
basic tasks like data normalization or hierarchical clustering in an efficient manner are 
not. The MatArray toolbox aims at filling this gap by offering efficient implementations 
of the most needed functions for microarray anal. The functions in the toolbox are 
command-line only, since it is geared toward seasoned Matlab users. 
http://www.ulb.ac.be/medecine/iribhm/microarray/toolbox. Davenet@ulb.ac.be. 
AB DNA microarrays are an exptl. technol. which consists of arrays of thousands of 
discrete DNA sequences that are printed on glass microscope slides. Image anal. is an 
important aspect of microarray expts. The aim of this step is to reduce an image of 
spots into a table with a measure of the intensity for each spot. Efficient, accurate and 
automatic anal. of DNA spot images is essential in order to use this technol. in lab. 
routines. We present an automatic non-supervised set of algorithms for a fast and 
accurate spot data extn. from DNA microarrays using morphol. operators which are 
robust to both intensity variation and artifacts. The approach can be summarized as 
follows. Initially, a gridding algorithm yields the automatic segmentation of the 
microarray image into spot quadrants which are later individually analyzed. Then the 
anal. of the spot quadrant images is achieved in five steps. First, as a prequantification, 
the spot size distribution law is calcd. Second, the background noise extn. is performed 
using a morphol. filtering by area. Third, an orthogonal grid provides the first approach 
to the spot locus. Fourth, the spot segmentation or spot boundaries definition is carried 
out using the watershed transformation. And fifth, the outline of detected spots allows 
the signal quantification or spot intensities extn.; in this respect, a noise model has 
been investigated. The performance of the algorithm has been compared with two 
packages: ScanAlyze and Genepix, showing its robustness and precision. A prototype 
system integrated in PDI32 (an image processing software for Windows) may be 
obtained from the authors on request. 
AB An overview and introduction to some of the software packages developed in the 
Brown and Botstein labs. at Stanford University for the visual display of microarray data 
are provided. A new tool DecCor2, designed to allow the genome-order display of 
aggregate microarray data, is described. The software packages briefly described 
include TreeView and Cluster, Promoter, and Caryoscope. 
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AB This paper compares the type I error and power of the one- and two-sample t- 
tests, and the one- and two-sample permutation tests for detecting differences in gene 
expression between two ***microarray*** samples with replicates using Monte 
Carlo ***simulations*** . When data are generated from a normal distribution, type 
I errors and powers of the one-sample parametric t-test and one-sample permutation 
test are very close, as are the two-sample t-test and two-sample permutation test, 
provided that the no. of replicates is adequate. When data are generated from a t- 
distribution, the permutation tests outperform the corresponding parametric tests if the 
no. of replicates is at least five. For data from a two-color dye swap expt., the one- 
sample test appears to perform better than the two-sample test since expression 
measurements for control and treatment samples from the same spot are correlated. 
For data from independent samples, such as the one-channel array or two-channel 
array expt. using ref. design, the two-sample t-tests appear more powerful than the 
one-sample t-tests. 
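The two permutation tests compared above can be sketched as exact enumerations for small replicate numbers. This is an illustration, not the study's simulation code: the function names are hypothetical, and the absolute mean (or absolute difference in means) is assumed as the test statistic.

```python
from itertools import combinations, product
from statistics import mean

def two_sample_permutation_pvalue(a, b):
    """Exact two-sided test on the difference in means of independent samples."""
    pooled = list(a) + list(b)
    n_a, n = len(a), len(a) + len(b)
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    hits = count = 0
    for idx in combinations(range(n), n_a):      # every relabeling of the pool
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n_a - (total - s) / (n - n_a))
        hits += diff >= observed - 1e-12         # tolerance for float ties
        count += 1
    return hits / count

def one_sample_permutation_pvalue(log_ratios):
    """Exact two-sided sign-flip test for paired data (e.g. per-spot log ratios)."""
    observed = abs(mean(log_ratios))
    hits = count = 0
    for signs in product((1, -1), repeat=len(log_ratios)):
        flipped = [s * x for s, x in zip(signs, log_ratios)]
        hits += abs(mean(flipped)) >= observed - 1e-12
        count += 1
    return hits / count
```

With three replicates per group the smallest attainable two-sample p-value is 2/20 = 0.1, which illustrates why the abstract ties the usefulness of permutation tests to having an adequate number of replicates.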
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AB In this work we have developed a new framework for microarray gene expression 
data anal. This framework is based on hidden Markov models. We have benchmarked 
the performance of this probability model-based clustering algorithm on several gene 
expression datasets for which external evaluation criteria were available. The results 
showed that this approach could produce clusters of quality comparable to two 
prevalent clustering algorithms, but with the major advantage of detg. the no. of 
clusters. We have also applied this algorithm to analyze published data of yeast cell 
cycle gene expression and found it able to successfully dig out biol. meaningful gene 
groups. In addn., this algorithm can also find correlation between different functional 
groups and distinguish between function genes and regulation genes, which is helpful 
to construct a network describing particular biol. assocns. Currently, this method is 
limited to time series data. 

RE.CNT 28 THERE ARE 28 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 163 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:323808 CAPLUS 
DN 138:286967 

TI Construction of computer network by comparison of the gene expression profile 
IN Wang, Renli 

PA Liang, Gangyu, Peop. Rep. China 

SO Faming Zhuanli Shenqing Gongkai Shuomingshu, 9 pp. CODEN: CNXXEV 
DT Patent 
LA Chinese 

FAN.CNT 1 
     PATENT NO.       KIND  DATE      APPLICATION NO.  DATE 
     ---------------  ----  --------  ---------------  -------- 
PI   CN 1342775       A     20020403  CN 2000-126356   20000911 

PRAI CN 2000-126356         20000911 

AB The invention provides a method of constructing a computer network based on 
comparison of an individual's gene expression profile to that of a specific individual such 
as an entertainer. The process consists of taking a sample from the individual, hybridizing 
the sample on a gene chip from the specific individual, data anal., and storing the data 
in a database for creation of the network. The computer network for connecting 
different human groups can be used for com. and individual uses. 
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AB Global analyses of RNA expression levels are useful for classifying genes and 
overall phenotypes. Often these classification problems are linked, and one wants to 
find "marker genes" that are differentially expressed in particular sets of "conditions." 
We have developed a method that simultaneously clusters genes and conditions, 
finding distinctive "checkerboard" patterns in matrixes of gene expression data, if they 
exist. In a cancer context, these checkerboards correspond to genes that are markedly 
up- or downregulated in patients with particular types of tumors. Our method, spectral 
biclustering, is based on the observation that checkerboard structures in matrixes of 
expression data can be found in eigenvectors corresponding to characteristic 
expression patterns across genes or conditions. In addn., these eigenvectors can be 
readily identified by commonly used linear algebra approaches, in particular the 
singular value decompn. (SVD), coupled with closely integrated normalization steps. 
We present a no. of variants of the approach, depending on whether the normalization 
over genes and conditions is done independently or in a coupled fashion. We then 
apply spectral biclustering to a selection of publicly available cancer expression data 
sets, and examine the degree to which the approach is able to identify checkerboard 
structures. Furthermore, we compare the performance of our biclustering methods 
against a no. of reasonable benchmarks (e.g., direct application of SVD or normalized 
cuts to raw data). 
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synthetic DNA ***microarrays*** 

IN Zuzan, Harry; Johnson, Valen E. 

PA Duke University, USA 
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PI WO 2003034064 A2 20030424 WO 2002-US31281 20020930 W: 
AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, 
JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TN, TR, 
TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM, ZW RW: GH, GM, KE, LS, MW, 
MZ, SD, SL, SZ, TZ, UG, ZM, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM, AT, BE, 
BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, SK, 
TR, BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG AU 
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AB Methods, systems, and ***computer*** program products for analyzing 
images of high d. ***microarray*** chips analyze the image by estg. background 
using a blurring kernel and/or a spatial multivariate statistical model of the background. 
The methods, systems, and computer program products can employ a multivariate 
statistical model and/or a blurring kernel to obtain more representative hybridization 
intensity results, particularly for pixels in boundary regions of the probe cells. The 
methods allow for alternative microarray configurations of nucleic acid probes and do 
not require the use of mismatch probes and can be independent of the type of 
nucleotide sequence used. Assocd. microarrays and systems are also described. 
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AB A review. The software iOmega was presented for microarray-based detection 
and genotyping of hepatitis C viruses (HCV). Chip design and prepn., hybridization and 
evaluation, specificity of the chip design, and automated data interpretation of the HCV 
genotype 3 were described. Intelligent combinatorial anal. in probe design and 
evaluation were discussed. 
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AB Methods, computer software and systems are provided for biol. data anal. In one 
embodiment, a probe logarithmic intensity error resolver is provided to analyze gene 
expression data obtained using multiprobes. 
AB Microfluidic cassettes that perform integrated biol. sample prepn. and DNA anal. 
require fluidic control and transport mechanisms built into the device. In this study, 
pneumatically actuated diaphragm pumps and valves were employed to achieve precise 
fluidic manipulation and enabled the execution of several sample-processing steps 
within a single cassette. However, the design of the microfluidic cassette to accomplish 
this multi-step fluidic protocol required a complex three-dimensional fluid path through 
valves, bends, various sized passageways and a porous filter for cell capture. In order 
to understand the fluidic behavior in such a device, measurements were taken of the 
pneumatic pressure delivered to the diaphragm pump as it pushed sample through the 
complicated fluidic pathway. Simultaneously monitored were the resulting volumetric 
flow rate, and the corresponding pre- and post-filter fluid pressures. The data enabled 
the construction of a model that simulated the fluidic action through the device using 
established fluid mechanics theory that closely matched flow rate and pressure data. 
The ability to simulate the behavior of diaphragm pumping and resulting fluidic 
movements in complex microfluidic devices provides a greater comprehension of this 
phenomenon and a useful tool in the application to future devices for biochem. anal. 
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AB PGAGENE is a web-based gene-specific genomic data search engine, which allows 
users to search over 5.9 million pieces of collective genetic and genomic data from the 
NHLBI supported Programs for Genomic Applications. This data includes microarray 
measurements, SNPs, and mutations, and data may be found using symbols, parts of 
gene names or products, Affymetrix probe IDs, GenBank accession nos., UniGene IDs, 
dbSNP IDs, and others. The PGAGENE indexing agent periodically maps all publicly 
available gene-specific PGA data onto LocusLink using dynamically generated cross- 
referencing tables. 
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AB The candidate genes of multifactorial diseases were detected by analyzing quant, 
whole-genome gene expression data (DNA microarrays) using the Bayesian network 
method. Two Bayesian network models were considered, namely, one that consists of 
a continuous node and a discrete node, and another that addnl. includes relationships 
between continuous nodes. In these models, a discrete node represents extrinsic 
factors, and the effect of such factors in the network context is estd. The difference of 
the no. of significant nodes with various values of the no. of parent nodes was shown. 
Significant hits without network context were largely reduced as parent nodes are 
included. In contrast, some genes turned out to be significant when the network 
context was considered. 
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AB KnowledgeEditor is a software that can be used to import a probe information 
from microarray data to a biomol. network and to modify a known metabolic pathway 
based on novel microarray expts. The drawn network on KnowledgeEditor can be 
exported in XML format, which is suitable to organize and share data among scientists. 
It also enables users to publish the modeled network on the web with the plug-in 
module GSCope Viewer. 
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AB The chapter describes the basic steps comprising a microarray data mining 
process. The European Union-funded Cross-Industry Std. Process for Data Mining 
(CRISP-DM) is used as the framework for an example of working through the anal. or 
mining of microarray data. CRISP-DM methodol. is detailed, complete, publicly 
accessible, and actively supported by various Data Mining software vendors. 
Establishing a framework for Data Mining that includes flexible boundaries and change, 
planning for testing of new algorithms, and tightening or loosening model success 
criteria will be important for the near future. 
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AB A review on various microarray software categorized by their purposes and 
characteristics. These include primer/probe design, image anal., data mining, statistics, 
pathway reconstruction, and annotation software. 
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AB A review. The class comparison and prediction methods for microarray data, with 
an example from Vanderbilt lung cancer SPORE, are discussed. These methods include 
the mutual information scoring (Info Score), weighted gene anal. (WGA), significance 
anal. of microarrays (SAM), and permutation t-test or F-test for identifying genes that 
are differentially expressed between different classes. The statistical class comparison 
and class prediction analyses for the microarray data may focus on the following steps: 
(1) Selecting the important gene patterns that perform differently among the study 
groups, (2) Using the class prediction model based upon the Weighted Flexible Compd. 
Covariate Method (WFCCM), classification tree methods, or other methods to verify if 
the genes selected in step one have statistically significant prediction power on the 
training samples, (3) Applying the prediction model generated from step two to a set of 
test samples for examg. the prediction power on the test samples, and (4) Employing 
the agglomerative hierarchical clustering algorithm to investigate the pattern among 
the significant discriminator genes as well as the biol. status. 
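
The permutation t-test named in the abstract above can be sketched for a single gene as follows. This is a minimal illustration only: the Welch form of the t statistic, the function names, the number of permutations and the fixed seed are assumptions made for the example, not details taken from the reviewed work.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t statistic (an assumed choice of statistic)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # Degenerate case: both groups have zero variance.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for one gene measured in two classes."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign class labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(t_statistic(perm_a, perm_b)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    tumour = [5.1, 5.3, 4.9, 5.2, 5.0]   # hypothetical log-expression values
    normal = [1.0, 1.2, 0.9, 1.1, 1.3]
    print(permutation_p_value(tumour, normal))
```

In a whole-array setting the same function would be applied gene by gene, and the resulting p-values would still need a multiplicity adjustment.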
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AB Statistical techniques for normalization of microarray data are discussed. These 
techniques allow the user to ext. as much of the biol. signal from a microarray expt. as 
possible. Normalization methods that can be applied to both oligo- and spotted-array 
data are discussed. 

RE.CNT 25 THERE ARE 25 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 176 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:244453 CAPLUS 
DN 138:380285 

TI Senescence-specific gene expression fingerprints reveal cell-type-dependent 
physical clustering of up-regulated chromosomal loci 
AU Zhang, Hong; Pan, Kuang-Hung; Cohen, Stanley N. 






CS Department of Genetics, Stanford University School of Medicine, Stanford, CA, 
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AB Replicative senescence is the state of irreversible proliferative arrest that occurs as 
a concomitant of progressive telomere shortening. By using cDNA ***microarrays*** 
and the GABRIEL system of ***computer*** programs to apply domain-specific and 
procedural knowledge for data anal., the authors investigated global changes in gene 
transcription occurring during replicative senescence in human fibroblasts and 
mammary epithelial cells (HMECs). Here the authors report the identification of 
transcriptional "fingerprints" unique to senescence, the finding that gene expression 
perturbations during senescence differ greatly in fibroblasts and HMECs, and the 
discovery that despite the disparate nature of the chromosomal loci affected by 
senescence in fibroblasts and HMECs, the up-regulated loci in both types of cells show 
phys. clustering. This clustering, which contrasts with the random distribution of genes 
down-regulated during senescence or up-regulated during reversible proliferative arrest 
(i.e., quiescence), supports the view that replicative senescence is assocd. with 
alteration of chromatin structure. 
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AB AVA (Array Visual Analyzer) is a Java program that provides a graphical 
environment for visualization and anal. of gene expression microarray data. Together 
with its interactive visualization tools and a variety of built-in data anal. and filtration 
methods, AVA effectively integrates microarray data normalization, quality assessment, 
and data mining into one application. The software is freely available for academic 
users on request from the authors. 
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AB QuickLIMS is a Microsoft Access-based lab. information and management system 
capable of processing all information for microarray prodn. The program's operational 
flow is protocol-based, dynamically adapting to changes of the process. It interacts 
with the lab. robot and with other database systems over the network, and it 
represents a complete soln. for the management of the entire manufg. process. 
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AB DNA and protein microarrays have become an established leading-edge technol. 
for large-scale anal. of gene and protein content and activity. The use of 
contact-printed microarrays has emerged as a relatively simple and cost effective method of 
choice, but its reliability is esp. susceptible to the quality of pixel information obtained 
from digital scans of spotted features in the microarray image. We address the 
statistical computation requirements for optimizing data acquisition and processing of 
digital scans. We consider the use of median filters to reduce noise levels in images 
and top-hat filters to correct for trends in background values. We also consider, as 
alternative estimators of spot intensity, disks of fixed radius, proportions of histograms 
and k-means clustering, either with or without a square-root intensity transformation 
and background subtraction. We identify, using combinatory procedures, optimal filter 
and estimator parameters, in achieving consistency among the replicates of a gene on 
each microarray. Our results, using test data from microarrays of HCMV, indicate that 
a highly effective approach for improving reliability and quality of microarray data is to 
apply a 21 by 21 top-hat filter, then est. spot intensity as the mean of the largest 20% 
of pixel values in the target region, after a square-root transformation, and cor. for 
background, by subtracting the mean of the smallest 70% of pixel values. Fortran90 
subroutines implementing these methods are available from the authors, or at 
http://www.bioss.ac.uk/.apprx.chris. Contact: chris@bioss.ac.uk. 
RE.CNT 21 THERE ARE 21 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 
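
The spot estimator described in the abstract above (square-root transformation, mean of the largest 20% of pixel values, background taken as the mean of the smallest 70%) can be sketched as below. This is a hedged reading of the abstract, not the authors' Fortran90 code: the function name, the rounding of the pixel counts and the decision to compute both means on the square-root scale are assumptions, and the 21 by 21 top-hat filtering step is deliberately left out.

```python
import math


def spot_intensity(pixels, top_frac=0.20, bottom_frac=0.70):
    """Background-corrected spot estimate on the square-root scale.

    pixels: raw, non-negative pixel values from one target region,
    assumed to have been filtered already (top-hat step not shown).
    """
    if not pixels:
        raise ValueError("target region is empty")
    # Square-root transform, then sort ascending.
    values = sorted(math.sqrt(p) for p in pixels)
    n = len(values)
    n_top = max(1, round(n * top_frac))        # largest 20% of pixels
    n_bottom = max(1, round(n * bottom_frac))  # smallest 70% of pixels
    foreground = sum(values[n - n_top:]) / n_top
    background = sum(values[:n_bottom]) / n_bottom
    return foreground - background


if __name__ == "__main__":
    # Hypothetical 10-pixel region: seven dark pixels, one dim, two bright.
    region = [0, 0, 0, 0, 0, 0, 0, 1, 100, 100]
    print(spot_intensity(region))
```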

L6 ANSWER 180 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:228683 CAPLUS 
DN 138:363393 

TI Summaries of Affymetrix GeneChip probe level data 

AU Irizarry, Rafael A.; Bolstad, Benjamin M.; Collin, Francois; Cope, Leslie M.; Hobbs, 
Bridget; Speed, Terence P. 

CS Department of Biostatistics, Johns Hopkins University, Baltimore, MD, 21205, USA 
SO Nucleic Acids Research (2003), 31(4), e15/1-e15/8 CODEN: NARHAD; ISSN: 0305-1048 

PB Oxford University Press 
DT Journal 
LA English 

AB High d. oligonucleotide array technol. is widely used in many areas of biomedical 
research for quant. and highly parallel measurements of gene expression. Affymetrix 
GeneChip arrays are the most popular. In this technol. each gene is typically 
represented by a set of 11-20 pairs of probes. In order to obtain expression measures 
it is necessary to summarize the probe level data. Using two extensive spike-in studies 
and a diln. study, we developed a set of tools for assessing the effectiveness of 
expression measures. We found that the performance of the current version of the 
default expression measure provided by Affymetrix Microarray Suite can be significantly 
improved by the use of probe level summaries derived from empirically motivated 
statistical models. In particular, improvements in the ability to detect differentially 
expressed genes are demonstrated. 

RE.CNT 18 THERE ARE 18 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 
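
The abstract above does not name the statistical model behind its probe level summaries. Purely as an illustration of what a model-based summary can look like, the sketch below fits an additive probe-plus-array model to log intensities with Tukey's median polish and reports one value per array. The choice of median polish, the function name and the input layout (rows are probes, columns are arrays) are assumptions made here, not a statement of the authors' method.

```python
from statistics import median


def median_polish_summary(log_intensities, max_iter=10, tol=1e-6):
    """Summarize one probe set; returns one expression value per array.

    log_intensities: list of rows (probes), each a list over arrays.
    """
    resid = [list(row) for row in log_intensities]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows   # probe effects
    col_eff = [0.0] * n_cols   # array effects
    for _ in range(max_iter):
        # Sweep out row (probe) medians.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            resid[i] = [v - row_med[i] for v in resid[i]]
            row_eff[i] += row_med[i]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep out column (array) medians.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for i in range(n_rows):
            resid[i] = [resid[i][j] - col_med[j] for j in range(n_cols)]
        col_eff = [c + m for c, m in zip(col_eff, col_med)]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        # Stop once a full sweep no longer changes the fit.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return [overall + c for c in col_eff]


if __name__ == "__main__":
    # Hypothetical log2 intensities: 3 probes (rows) on 2 arrays (columns).
    probe_set = [[5, 7], [6, 8], [7, 9]]
    print(median_polish_summary(probe_set))
```

Because medians are used at every step, a single outlying probe has limited influence on the per-array value, which is the usual argument for robust summaries of this kind.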

L6 ANSWER 181 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:228413 CAPLUS 
DN 138:363391 

TI Comparisons and validation of statistical clustering techniques for microarray gene 
expression data 

AU Datta, Susmita; Datta, Somnath 

CS Department of Mathematics and Statistics and Department of Biology, Georgia 
State University, Atlanta, GA, 30303, USA 

SO Bioinformatics (2003), 19(4), 459-466 CODEN: BOINFP; ISSN: 1367-4803 
PB Oxford University Press 
DT Journal 
LA English 

AB With the advent of microarray chip technol., large data sets are emerging contg. 
the simultaneous expression levels of thousands of genes at various time points during 
a biol. process. Biologists are attempting to group genes based on the temporal 
pattern of their expression levels. While the use of hierarchical clustering (UPGMA) 
with correlation "distance" has been the most common in the microarray studies, there 
are many more choices of clustering algorithms in pattern recognition and statistics 
literature. At the moment there do not seem to be any clear-cut guidelines regarding 
the choice of a clustering algorithm to be used for grouping genes based on their 
expression profiles. In this paper, we consider six clustering algorithms (of various 
flavors!) and evaluate their performances on a well-known publicly available 
***microarray*** data set on sporulation of budding yeast and on two 
***simulated*** data sets. Among other things, we formulate three reasonable 
validation strategies that can be used with any clustering algorithm when temporal 
observations or replications are present. We evaluate each of these six clustering 
methods with these validation measures. While the "best" method is dependent on the 
exact validation strategy and the no. of clusters to be used, overall Diana appears to be 
a solid performer. Interestingly, the performance of correlation-based hierarchical 
clustering and model-based clustering (another method that has been advocated by a 
no. of researchers) appear to be on opposite extremes, depending on what validation 
measure one employs. Next it is shown that the group means produced by Diana are 
the closest and those produced by UPGMA are the farthest from a model profile based 
on a set of hand-picked genes. 
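
Hierarchical clustering (UPGMA) with correlation distance, the baseline mentioned in the abstract above, can be sketched in a few lines. This is a small illustrative implementation under stated assumptions: distance is taken as one minus the Pearson correlation, merging stops at a requested number of clusters, and no profile is constant over time (a constant profile would give a zero denominator). The function names are hypothetical.

```python
import math
from statistics import mean


def correlation_distance(x, y):
    """1 - Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)


def upgma(profiles, n_clusters):
    """Average-linkage agglomerative clustering of gene profiles."""
    clusters = [[i] for i in range(len(profiles))]

    def average_linkage(c1, c2):
        # UPGMA: mean of all pairwise distances between the two clusters.
        return mean(correlation_distance(profiles[i], profiles[j])
                    for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        a, b = min(pairs, key=lambda ab: average_linkage(clusters[ab[0]],
                                                         clusters[ab[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical temporal profiles: two rising genes, two falling genes.
    genes = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [8, 6, 4, 1.5]]
    print(upgma(genes, n_clusters=2))
```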
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AB We have analyzed microarray data using a modeling approach based on the 
multivariate statistical method partial least squares (PLS) regression to identify genes 
with periodic fluctuations in expression levels coupled to the cell cycle in the budding 
yeast, Saccharomyces cerevisiae. PLS has major advantages for analyzing microarray 
data since it can model data sets with large nos. of variables and with few 
observations. A response model was derived describing the expression profile over 
time expected for periodically transcribed genes, and was used to identify budding 
yeast transcripts with similar profiles. PLS was then used to interpret the importance of 
the variables (genes) for the model, yielding a ranking list of how well the genes fitted 
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the generated model. Application of an appropriate cutoff value, calcd. from 
randomized data, allows the identification of genes whose expression appears to be 
synchronized with cell cycling. Our approach also provides information about the stage 
in the cell cycle where their transcription peaks. Three synchronized yeast cell 
microarray data sets were analyzed, both sep. and combined. Cell cycle-coupled 
periodicity was suggested for 455 of the 6,178 transcripts monitored in the combined 
data set, at a significance level of 0.5%. Among the candidates, 85% of the known 
periodic transcripts were included. Anal. of the three data sets sep. yielded similar 
ranking lists, showing that the method is robust. 
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AB Motivation: DNA microarrays have recently been used for the purpose of 
monitoring expression levels of thousands of genes simultaneously and identifying 
those genes that are differentially expressed. The probability that a false identification 
(type I error) is committed can increase sharply when the no. of tested genes gets 
large. Correlation between the test statistics attributed to gene co-regulation and 
dependency in the measurement errors of the gene expression levels further 
complicates the problem. In this paper the authors address this very large multiplicity 
problem by adopting the false discovery rate (FDR) controlling approach. To address 
the dependency problem, the authors present three resampling-based FDR controlling 
procedures, that account for the test statistics distribution, and compare their 
performance to that of the naive application of the linear step-up procedure in 
Benjamini and Hochberg (1995). The procedures are studied using ***simulated*** 
***microarray*** data, and their performance is examd. relative to their ease of 
implementation. Results: Comparative simulation anal. shows that all four FDR 
controlling procedures control the FDR at the desired level, and retain substantially 
more power than the family-wise error rate controlling procedures. In terms of power, 
using resampling of the marginal distribution of each test statistics substantially 
improves the performance over the naive one. The highest power is achieved, at the 
expense of a more sophisticated algorithm, by the resampling-based procedures that 
resample the joint distribution of the test statistics and est. the level of FDR control. 
Availability: An R program that adjusts p-values using FDR controlling procedures is 
freely available over the Internet at www.math.tau.ac.il/.apprx.ybenja. 
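
The linear step-up procedure of Benjamini and Hochberg (1995), which the abstract above uses as its naive baseline, can be sketched as follows. The function name and the choice to return indices are assumptions made for the example; the resampling-based procedures described in the abstract are not shown.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected by the linear step-up procedure.

    With m p-values sorted ascending, find the largest rank k such that
    p_(k) <= k * q / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes
    return sorted(order[:k])


if __name__ == "__main__":
    # Hypothetical p-values for ten genes.
    p = [0.001, 0.008, 0.039, 0.041, 0.042,
         0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

Because the procedure is step-up, a p-value that fails its own threshold is still rejected when a larger p-value further down the sorted list passes.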
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AB A review, with refs. DNA microarray assays represent the first widely used 
application that attempts to build upon the information provided by genome projects in 
the study of biol. questions. One of the greatest challenges with working with 
microarrays is collecting, managing, and analyzing data. Although several com. and 
noncommercial solns. exist, there is a growing body of freely available, open source 
software that allows users to analyze data using a host of existing techniques and to 
develop their own and integrate them within the system. Here we review three of the 
most widely used and comprehensive systems, the statistical anal. tools written in R 
through the Bioconductor project (http://www.bioconductor.org), the Java-based TM4 
software system available from The Institute for Genomic Research 
(http://www.tigr.org/software), and BASE, the Web-based system developed at Lund 
University (http://base.thep.lu.se). 
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TI Microarray-based cancer diagnosis with artificial neural networks 
AU Ringner, Markus; Peterson, Carsten 

CS Complex Systems Division, Department of Theoretical Physics, Lund University, 
Swed. 

SO BioTechniques (2003), (Suppl.), 30-32,34-35 CODEN: BTNQDO; ISSN: 0736-6205 
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DT Journal; General Review 
LA English 

AB A review, with refs. In recent years, the advent of exptl. methods to probe gene 
expression profiles of cancer on a genome-wide scale has led to widespread use of 
supervised machine learning algorithms to characterize these profiles. The main 
applications of these anal. methods range from assigning functional classes of 
previously uncharacterized genes to classification and prediction of different cancer 
tissues. This article surveys the application of machine learning algorithms to 
classification and diagnosis of cancer based on expression profiles. To exemplify the 
important issues of the classification procedure, the emphasis of this article is on one 
such method, namely artificial neural networks. In addn., methods to ext. genes that 
are important for the performance of a classifier, as well as the influence of sample 
selection on prediction results are discussed. 
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PB IOS Press 
DT Journal 
LA English 

AB We wished to quantify the state-of-the-art of our understanding of clusters in 
microarray data. To do this we systematically compared the clusters produced on sets 
of microarray data using a representative set of clustering algorithms (hierarchical, 
k-means, and a modified version of QT_CLUST) with the annotation schemes MIPS, 
GeneOntol. and GenProtEC. We assumed that if a cluster reflected known biol. its 
members would share related ontol. annotations. This assumption is the basis of 
"guilt-by-assocn." and is commonly used to assign the putative function of proteins. To 
statistically measure the relationship between cluster and annotation we developed a 
new predictive discriminatory measure. We found that the clusters found in microarray 
data do not in general agree with functional annotation classes. Although many 
statistically significant relationships can be found, the majority of clusters are not 
related to known biol. (as described in annotation ontologies). This implies that use of 
guilt-by-assocn. is not supported by annotation ontologies. Depending on the est. of 
the amt. of noise in the data, our results suggest that bioinformatics has only codified a 
small proportion of the biol. knowledge required to understand microarray data. The 
annotated clusters can be found at 
http://www.aber.ac.uk/compsci/Research/bio/dss/gba/. 
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TI Incorporation of DNA chip technology to the simulation and validation of flux 
analysis in yeast diauxic growth 
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DT Journal 
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AB We incorporated gene expression information from cDNA ***microarray*** 
into flux anal. to ***simulate*** yeast diauxic growth. Expression ratios of both 
growth phases were applied to assign the split ratio at glyoxylate shunt during 
simulation, in which the equation was math. unsolvable due to the singularity and 
artificial split ratios, which were traditionally introduced without biol. evidence. In 
addn., the directionality of ***microarray*** dataset was used as a further 
constraint during ***simulation***. Metabolic fluxes obtained by this modified 
approach are in general consistent with microarray anal. However, discrepancies 
occurred when the quantity of fluxes was compared, probably due to the substantial 
redn. of substrates at phase II in which the increase in the enzymic levels was not 
proportional to the increase of substrate flow, as would be predicted from microarray 
dataset. The modified flux anal. might have brought a new approach to investigate 
other cellular pathways. 
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TI Photonic modeling of DNA chips 
AU Getin, Stephane 
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DT Journal 
LA French 

AB Methods of modeling the patterns of distribution of emissions from fluorescent 
reporter groups in biochips are developed. The spatial distribution of fluorescence 
emissions from an element on a hybridization microarray is not necessarily uniform. 
Modeling of the emission patterns can be used to optimize scanning and uniformity of 
data collection and avoid artifacts in anal. 
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TI Simulation of biological systems 

AU Gidrol, Xavier 

CS Genopole, Evry, Fr. 
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0298-6248 
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DT Journal; General Review 
LA French 

AB A review. Presented is the use of DNA microarray technol. and gene expression 
profiling, in conjunction with modeling of biol. systems for evaluation of macromol. 
function and cellular response to disease status, such as cancer. 

L6 ANSWER 190 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2003:197017 CAPLUS 
DN 138:164190 
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AU Spellman, Paul T.; Miller, Michael; Stewart, Jason; Troup, Charles; Sarkans, Ugis; 
Chervitz, Steve; Bernhart, Derek; Sherlock, Gavin; Ball, Catherine; Lepage, Marc; 
Swiatek, Marcin; Marks, W. L.; Goncalves, Jason; Markel, Scott; Iordan, Daniel; 
Shojatalab, Mohammadreza; Pizarro, Angel; White, Joe; Hubley, Robert; Deutsch, Eric; 
Senger, Martin; Aronow, Bruce J.; Robinson, Alan; Bassett, Doug; Stoeckert, Christian 
J., Jr.; Brazma, Alvis 

CS Department of Cell and Molecular Biology, University of California at Berkeley, 
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2002-3-9-research0046.pdf 

PB BioMed Central Ltd. 
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AB Meaningful exchange of microarray data is currently difficult because it is rare that 
published data provide sufficient information depth or are even in the same format 
from one publication to another. Only when data can be easily exchanged will the 
entire biol. community be able to derive the full benefit from such microarray studies. 
To this end we have developed three key ingredients towards standardizing the storage 
and exchange of microarray data. First, we have created a minimal information for the 
annotation of a microarray expt. (MIAME)-compliant conceptualization of microarray 
expts. modeled using the unified modeling language (UML) named MAGE-OM 
(microarray gene expression object model). Second, we have translated MAGE-OM into 
an XML-based data format, MAGE-ML, to facilitate the exchange of data. Third, some 
of us are now using MAGE (or its progenitors) in data prodn. settings. Finally, we have 
developed a freely available software tool kit (MAGE-STK) that eases the integration of 
MAGE-ML into end user's systems. MAGE will help microarray data producers and 
users to exchange information by providing a common platform for data exchange, and 
MAGE-STK will make the adoption of MAGE easier. 
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AB A review with 29 refs. Comprehensive microarrays covering large nos, of the 
predicted expressed transcripts for some invertebrates and vertebrates have been 
available for some time. Despite predictions that this technol. will transform biol., to 
date there have been few published studies using microarrays to generate novel 
insights in developmental biol. 
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TI CATMA: a complete arabidopsis GST database 

AU Crowe, Mark L; Serizet, Carine; Thareau, Vincent; Aubourg, Sebastien; Rouze, 
Pierre; Hilson, Pierre; Beynon, Jim; Weisbeek, Peter; van Hummelen, Paul; Reymond, 
Philippe; Paz-Ares, Javier; Nietfeld, Wilfried; Trick, Martin 
CS Laboratoire associe de l'INRA, Fr. 

SO Nucleic Acids Research (2003), 31(1), 156-158 CODEN: NARHAD; ISSN: 0305-1048 

PB Oxford University Press 
DT Journal 
LA English 

AB The Complete Arabidopsis Transcriptome Microarray (CATMA) database contains 
gene sequence tags (GST) and gene model sequences for over 70% of the predicted 
genes in the Arabidopsis thaliana genome as well as primer sequences for GST 
amplification and a wide range of supplementary information. All CATMA GST 
sequences are specific to the gene for which they were designed, and all gene models 
were predicted from a complete reannotation of the genome using uniform parameters. 
The database is searchable by sequence name, sequence homol. or direct SQL query. 
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TI The Stanford Microarray Database: data access and quality assessment tools 
AU Gollub, Jeremy; Ball, Catherine A.; Binkley, Gail; Demeter, Janos; Finkelstein, 
David B.; Hebert, Joan M.; Hernandez-Boussard, Tina; Jin, Heng; Kaloper, Miroslava; 
Matese, John C; Schroeder, Mark; Brown, Patrick O.; Botstein, David; Sherlock, Gavin 
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AB The Stanford Microarray Database (SMD) serves as a microarray research 
database for Stanford investigators and their collaborators. In addn., SMD functions as 
a resource for the entire scientific community, by making freely available all of its 
source code and providing full public access to data published by SMD users, along 
with many tools to explore and analyze those data. SMD currently provides public 
access to data from 3,500 microarrays, including data from 85 publications, and this 
total is increasing rapidly. Some of the SMD's newer tools for accessing public data, 
assessing data quality and for data anal. are described. 
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TI NetAffx: Affymetrix probesets and annotations 
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AB NetAffx details and annotates probesets on Affymetrix GeneChip microarrays. 
These annotations include: static information specific to the probeset compn.; 
sequence annotations extd. from public databases; and protein sequence-level 
annotations derived from public domain programs as well as libraries of hidden Markov 
models (HMMs) developed by Affymetrix. For each probeset, NetAffx lists the probe 
sequences, and the consensus sequence interrogated by the probes; for the larger chip 
sets, interactive maps display this sequence data in genomic context. Sequence 
annotations include gene ontol. (GO) terms and depiction of GO graph relationships; 
predicted protein domains and motifs; orthologous sequences; links to relevant 
pathways; and links to public databases including UniGene, LocusLink, SWISS-PROT 
and OMIM. 
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AB ArrayExpress is a new public database of microarray gene expression data at the 
EBI, which is a generic gene expression database designed to hold data from all 
microarray platforms. ArrayExpress uses the annotation std. Min. Information About a 
Microarray Expt. (MIAME) and the assocd. XML data exchange format Microarray Gene 
Expression Markup Language (MAGE-ML) and it is designed to store well annotated 
data in a structured way. The ArrayExpress infrastructure consists of the database 
itself, data submissions in MAGE-ML format or via an online submission tool 
MIAMExpress, online database query interface, and the Expression Profiler online anal. 
tool. ArrayExpress accepts three types of submission (arrays, expts. and protocols), 
each of which is assigned an accession no. Help on data submission and annotation is 
provided by the curation team. The database can be queried on parameters such as 
author, lab., organism, expt. or array types. With an increasing no. of organisations 
adopting MAGE-ML std., the vol. of submissions to ArrayExpress is increasing rapidly. 
The database can be accessed at http://www.ebi.ac.uk/arrayexpress. 
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PI US 2003044320 Al 20030306 US 2001-943937 20010831 US 
2003044808 A1 20030306 US 2001-20025 20011207 WO 2003018772 
A2 20030306 WO 2002-US27971 20020903 WO 2003018772 A3 
20030417 W: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, 
CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, 
ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, 
MK, MN, MW, MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, 
TJ, TM, TN, TR, TT, TZ, UA, UG, UZ, VC, VN, YU, ZA, ZM, ZW RW: GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, 
TM, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, 
PT, SE, SK, TR, BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, 
TG 
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AB The present invention discloses platform techno!, which integrates current DNA 
micro array technol. and current high throughput screening technol. The invention 
contains three major components: an array gridding head, the hybrid glass chip/micro 
titer plate format plate that contains the micro arrays produced by the 
arraying/gridding head, and an array scanner with data acquisition and anal. software. 
The arraying/gridding head is capable of simultaneously depositing DNA, RNA, 
peptidal nucleic acid (PNA), or polypeptide (protein) solns., etc. onto chem. treated 
modified surfaces in 96, 384 and 1536 well formats of repeating patterns on the 
modified glass chips/plates. The micro arrays are composed of arrays of 96, 384 or 
1536 patterns with defined specifications on the single glass "chip" packaged as a std. 
micro titer plate conforming to the Society of Biomol. Screening (SBS) specification for 
robotic handling. The array reading and anal. component includes an array scanning 
device and anal. software. The array scanner is configured to read micro arrays in the 
micro titer plate format of the invention as well as current microscope slide format. 
Thus, the invention transforms current DNA micro array technol. into a high throughput 
screening tool. 
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TI MAPPFinder: using Gene Ontology and GenMAPP to create a global gene- 
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AB MAPPFinder is a tool that creates a global gene-expression profile across all areas 
of biol. by integrating the annotations of the Gene Ontol. (GO) Project with the free 
software package GenMAPP (http://www.GenMAPP.org). The results are displayed in a 
searchable browser, allowing the user to rapidly identify GO terms with over- 
represented nos. of gene expression changes. Clicking on GO terms generates 
GenMAPP graphical files where gene relationships can be explored, annotated, and files 
can be freely exchanged. 
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AB The invention relates to a computer-based system for supporting DNA sequence 
anal. Data obtained from a homol. search carried out with DNA chips or protein chips 
are compiled in databases. A search using individual probe sequences is carried out 
against this secondary database. The system includes software and a worldwide web 
internet server. 
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AB A microarray processing app. for a substance to be processed (e.g., nucleic acid) 
is provided, with which the operation for processing a substance to be processed is 
easy, and the labor and time required for the operation is reduced. The processing 
app. for a substance to be processed (microarray processing app.) comprises an app. 
main body equipped with two sealable processing tanks (reaction tanks), a personal 
computer, and four processing units in total accommodated in the resp. processing 
tank in such a way that they are freely mounted or detached. The app. main body is 
equipped with a horizontal stage and a vertical stage. On the horizontal stage, 
installed are a chip-mounting part, a probe soln.-accommodating container-mounting 
part, a container-mounting part, and the resp. processing tank. On the vertical stage, 
installed is a probe soln.-supplying means (reaction liq.-supplying means). In addn., a 
supply circuit, a discharge circuit, and a temp.-regulating means are mainly installed 
inside the app. main body. Diagrams describing the app. assembly are given. 
AB This study proposes a stamper array chip with embedded microchannels that 
delivers fixed size and shape liq. samples to a bottom chip for quant. biodiagnosis and 
bioassays. The transfer process and physics are analyzed by solving first-principle 
equations numerically. The simulation proves that the surface tension force inside a 
microchannel plays an important role in driving the liq. fluid from the reservoir to the 
tip of the microchannel and causes some degree of liq.-air interface oscillation due to 
the interaction of a pressure wave and the surface tension force. The oscillation of the 
meniscus-free surface helps the delivery of the liq. to the bottom chip by forming 
microchannels and attaching to the surface. Most of all, the simulation of the stamping 
process indicates that the control of spot size transferred to the bottom surface is 
feasible for precise diagnosis under different stamping speeds and/or various contact 
angles due to different surface tension coeffs. between fluids and solid surfaces. 
AB Microarrays have emerged as the premier tool for studying gene expression on a 
genomic scale. Scientists seeking to harness the potential of this technique are often 
challenged by the large quantities of data produced. In support of their ongoing work 
in microarray anal. of gene expression, the authors developed a suite of software that 
allows users in the lab. to capture, manage, and effectively analyze data from DNA 
microarray expts. The TM4 suite of tools consists of four major applications: Microarray 
Data Manager (MADAM), TIGR_Spotfinder, Microarray Data Anal. System (MIDAS), and 
Multiexperiment Viewer (MeV), as well as a Minimal Information About a Microarray 
Expt. (MIAME)-compliant MySQL database. The TM4 software system represents a 
comprehensive, extensible, open-source, and freely available collection of tools that will 
be of use to a wide range of labs. conducting microarray expts. 
AB A method and system for signal generation and signal amplification from an array 
contg. bound, unlabeled target mols. Following exposure of the array to a sample soln. 
contg. unlabeled target RNA mols., blunt ends are generated on each probe/target 
double-stranded hybrid. A labeled primer oligonucleotide linker is then bound to the blunt 
ends. Next, in an iterative, inner process, addnl. layers of labeled oligonucleotide 
linkers are added, shell-by-shell, to form a dendrimer-like mol. complex bound through 
the oligonucleotide linker to the probe/target hybrid. 
AB The app. has a unit for multistep-processing of a plurality of samples and a 
recording unit contg. memory devices which store the information of the biol. samples, 
the information of the process steps, and the information of the results of processing. 
The app. and computer program are useful for pretreatment including extn., 
amplification by PCR, diln., and modification of nucleic acids, etc., before 
electrophoresis. Diagrams of the app. and flowcharts for PCR-SSCP (single-strand 
conformation polymorphism) and gene anal. are given. 
AB We present a theor. thermodn. framework for the design of more efficient 
oligonucleotide microarrays. A general thermodn. relation is derived to describe the 
electrostatic surface effects on the binding of the assayed biomol. to a surface-tethered 
mol. probe. The relation is applied to analyze how the nucleic acid target, the 
oligonucleotide probe, and their DNA duplex electrostatic interactions with the surface 
affect the hybridization on DNA arrays. Taking advantage of a closed form exact soln. 
of the linear Poisson-Boltzmann equation for a charged ion-penetrable sphere in 
electrolyte soln. interacting with a plane wall, we study the effects of the surface and 
soln. conditions. Binding free energy is found as a function of the surface material, 
dielec. or metal, the surface charge d., linker mol. length, temp., and added salt 
content. The charge or elec. potential of the dielec. or metal surface, resp., is shown to 
dominate the hybridization, esp. at low added salt or short linker length. We predict 
that substantial enhancement of sensitivity, selectivity, and reliability of microarrays 
can be achieved by control of the surface conditions. As examples, we discuss how to 
overcome two limitations of current technologies: nonequal sensitivity of the probes 
with different GC and AT bases content, and poor match/mismatch discrimination. In 
addn., we suggest the design of microarray conditions where the tested nucleic acid is 
unfolded, thus making possible the screening of a larger sequence with single 
nucleotide resoln. These promising findings are discussed and further exptl. tests 
are suggested. 
AB DNA microarray anal. of B cell subsets has identified comprehensive programs of 
gene expression that distinguish B cells at discrete stages of differentiation. The next 
task is to identify key genetic signals within these complex programs that regulate the 
dynamic cellular events during B cell activation in vivo. After stimulation with antigen, 
naive B cells proliferate and differentiate, and then produce antibodies. Crucial qual. 
differences in antibody responses are obsd. depending on whether or not B cells 
receive T cell help during activation. Proteins, lipopolysaccharides, and polysaccharides 
stimulate T-dependent (TD), T-independent type 1 (TI-1), and type 2 (TI-2) antibody 
responses, resp. Only TD responses generate somatically mutated antibody-forming 
(plasma) cells and memory B cells, which produce high affinity anamnestic responses 
to subsequent antigen challenge. Somatic mutation of Ig genes occurs during B cell 
proliferation in germinal centers (GC), which are typical in TD responses but rare in TI 
responses. However, we have described a model, which is exceptional because 
numerous large GC form in response to a model TI-2 antigen, (4-hydroxy-3- 
nitrophenyl) acetyl (NP)-Ficoll. Significantly, these GC undergo involution before 
memory B cells are generated. This model provides an opportunity to investigate the 
genetic signals that drive memory cell formation, and we have compared global gene 
expression in TI and TD GC to identify a relatively small no. of genes that are 
differentially expressed between the two prototype B cell responses. This model 
demonstrates how genome-scale technol. can be adapted to investigate specific 
aspects of B cell biol. 
AB Recent development of technologies (e.g., microarray technol.) that are capable of 
producing massive amts. of genetic data has highlighted the need for new pattern 
recognition techniques that can mine and discover biol. meaningful knowledge in large 
data sets. Many researchers have begun an endeavor in this direction to devise such 
data-mining techniques. As such, there is a need for survey articles that periodically 
review and summarize the work that has been done in the area. This article presents 
one such survey. The first portion of the paper is meant to provide the basic biol. 
(mostly for non-biologists) that is required in such a project. This part is only meant to 
be a starting point for those experts in the tech. fields who wish to embark on this new 
area of bioinformatics. The second portion of the paper is a survey of various data- 
mining techniques that have been used in mining microarray data for biol. knowledge 
and information (such as sequence information). This survey is not meant to be 
treated as complete in any form, since the area is currently one of the most active, and 
the body of research is very large. Furthermore, the applications of the techniques 
mentioned here are not meant to be taken as the most significant applications of the 
techniques, but simply as examples among many. 
AB The following study outlines the results of printing expts. conducted with a variety 
of DNA and protein samples for the optimization of microarray d. and the correlation of 
feature size with the predictive model. 
AB The anal. of the leukemia data from the Whitehead/MIT group is a discriminant anal. 
(also called supervised learning). Among thousands of genes whose expression 
levels are measured, not all are needed for discriminant anal. A gene may either not 
contribute to the sepn. of two types of tissues/cancers, or it may be redundant because 
it is highly correlated with other genes. There are two theor. frameworks in which 
variable selection (or gene selection in our case) can be addressed. The first is model 
selection, and the second is model averaging. We have carried out model selection 
using Akaike information criterion and Bayesian information criterion with logistic 
regression (discrimination, prediction, or classification) to det. the no. of genes that 
provide the best model. These model selection criteria set upper limits of 22.apprx.25 
and 12.apprx.13 genes for this data set with 38 samples, and the best model consists 
of only one (no.4847, zyxin) or two genes. We have also carried out model averaging 
over the best single-gene logistic predictors using three different wts.: maximized 
likelihood, prediction rate on training set, and equal wt. We have obsd. that the 
performance of most of these weighted predictors on the testing set is gradually 
reduced as more genes are included, but a clear cutoff that separates good and bad 
prediction performance is not found. 
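
The two criteria named in this abstract have standard closed forms, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of fitted parameters, n the number of samples, and L the maximized likelihood. The following is a minimal sketch of model selection under those formulas. The candidate names and log-likelihood values are hypothetical placeholders, not results from the study; only the sample count of 38 comes from the abstract.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


def best_model(candidates, n_samples, criterion="aic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates: list of (name, maximized log-likelihood, parameter count).
    """
    def score(candidate):
        _, log_likelihood, n_params = candidate
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_samples)

    return min(candidates, key=score)[0]


# Hypothetical logistic-regression fits on 38 samples (values are illustrative).
candidates = [
    ("one_gene", -5.0, 2),    # intercept + 1 gene
    ("two_genes", -4.5, 3),   # intercept + 2 genes
    ("ten_genes", -1.0, 11),  # intercept + 10 genes
]
print(best_model(candidates, n_samples=38, criterion="aic"))
print(best_model(candidates, n_samples=38, criterion="bic"))
```

Both criteria penalize each added gene, which is why a small model can win even when a larger one fits the training set more closely.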
AB We describe a method to improve the classification of microarray data presented 
in Golub et al. (1999) through the anal. of present vs. absent calls in selectively 
expressed genes. This method does not rely on scaling or normalization factors in the 
comparison of data across subjects. Several genes in the Golub et al. (1999) dataset 
are selectively expressed between acute myeloid leukemia (AML) and acute 
lymphoblastic leukemia (ALL). We show that the presence or absence of expression in 
the 30 to 100 most selective genes is sufficient to correctly classify and diagnose the 
disease state of the subjects in the training dataset, and, with only one error, in the 
independent set. In this initial anal., the level of gene expression is not used. The 
exemplar, or cluster center, for each of the two diseases is computed as the real- 
valued av. of the (expressed/not expressed) binary values for each of the most 
selective genes of the subjects with each disease (27 ALL, 11 AML). The Euclidean 
distance to each of the two exemplars is then computed for each subject. Members of 
a cluster are closer (smaller in distance) to the exemplar of that cluster than to the 
exemplar of another. For example, the range of distances of the 10 most selective 
genes from the AML subjects in the training set to the AML exemplar is 0.05 to 0.28, 
and to the ALL exemplar is 0.62 to 0.87. These data, along with the distances of the 
ALL subjects from the two exemplars, show that the two clusters are well sepd., with no 
overlap. 
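
The exemplar scheme described in this abstract can be sketched as follows: each class exemplar is the per-gene mean of binary present/absent calls, and a subject is assigned to the class whose exemplar is nearest in Euclidean distance. This is an illustrative sketch only. The toy call vectors are invented, and any scaling the authors applied to obtain their reported distance ranges is not reproduced here.

```python
import math


def exemplar(calls):
    """Cluster center: per-gene mean of binary (1 = present, 0 = absent) calls."""
    n_subjects = len(calls)
    return [sum(gene_calls) / n_subjects for gene_calls in zip(*calls)]


def distance(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(subject, exemplars):
    """Return the label of the exemplar closest to the subject's call vector."""
    return min(exemplars, key=lambda label: distance(subject, exemplars[label]))


# Hypothetical training calls for four selective genes (not data from the paper).
all_calls = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0]]
aml_calls = [[0, 0, 1, 1], [0, 1, 1, 1]]
exemplars = {"ALL": exemplar(all_calls), "AML": exemplar(aml_calls)}

print(classify([1, 1, 0, 0], exemplars))
print(classify([0, 0, 1, 1], exemplars))
```

Because only presence or absence is used, the sketch needs no scaling or normalization across subjects, which matches the motivation given in the abstract.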
AB A review with refs. We are facing an information explosion in the biomedical 
sciences. For example, our ability to measure the expression levels of thousands of 
different genes simultaneously in a particular cell or tissue has far outpaced our ability 
to store, manage, and analyze the data being generated. In this review, we explore 
the use of evolutionary computation for dealing with some of the difficult statistical and 
computational challenges that have resulted from the development and implementation 
of new technologies such as DNA microarrays. We review genetic algorithms and 
genetic programming as evolutionary computation strategies that have been applied to 
the anal. of DNA microarray data. 
AB Microarray expts. provide the scientific community with huge amts. of data. 
Without appropriate methodologies and tools, significant information and knowledge 
hidden in these data may not be discovered. Therefore, there is a need for methods 
capable of handling and exploring large data sets. The field of data mining and 
machine learning provides a wealth of methodologies and tools for analyzing large data 
sets. Two classical machine learning techniques suitable for microarray anal. are 
described, namely decision trees and artificial neural networks. An outline of how 
these approaches can be integrated into a wider data mining framework is presented. 
AB The invention relates to a computer software program to design an optimum 
oligo-nucleic acid base sequence candidate from nucleic acid base sequences being 
analyzed using a computer for DNA chip. The program comprises a first command to 
receive the specification of resp. tolerated ranges of double-chain bond temp., base 
sequence length and GC content, and to store the information on the priority order of 
resp. items in the memory. The program comprises a second command, while 
extending the partial sequence in the aforementioned nucleic acid base sequences 
being analyzed, to det. whether or not a sequence in each length falls within resp. 
tolerated ranges based on the priority items received by the aforementioned first 
command, and if it does fall within the ranges, to output the partial sequence in the 
applicable length as an oligo-nucleic acid base sequence candidate. The program 
comprises a third command to display, based on the aforementioned priority order, the 
oligo-nucleic acid sequence candidate outputted by the aforementioned second 
command along with the values of resp. items. 
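
The second command described in this abstract (extend a partial sequence and keep each length whose properties fall inside the tolerated ranges) might be sketched as below. Several details are assumptions: the abstract does not say how the double-chain bond temp. is computed, so the Wallace rule (2 °C per A/T, 4 °C per G/C) is swapped in for illustration, the function names are hypothetical, and the priority-ordered display of the third command is not covered.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule melting temperature estimate (an assumed stand-in)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def oligo_candidates(target, length_range, tm_range, gc_range):
    """Extend a partial sequence from every start position and keep the
    lengths whose Tm and GC content both fall inside the tolerated ranges."""
    min_len, max_len = length_range
    found = []
    for start in range(len(target) - min_len + 1):
        for length in range(min_len, max_len + 1):
            if start + length > len(target):
                break  # cannot extend past the end of the analyzed sequence
            sub = target[start:start + length]
            tm, gc = wallace_tm(sub), gc_content(sub)
            if tm_range[0] <= tm <= tm_range[1] and gc_range[0] <= gc <= gc_range[1]:
                found.append({"seq": sub, "start": start, "tm": tm, "gc": gc})
    return found


# Toy input; real probes would be far longer than 4-5 bases.
hits = oligo_candidates("ATGCGCAT", (4, 5), (12, 16), (0.4, 0.65))
print([hit["seq"] for hit in hits])
```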
AB A review on the tools for the data preprocessing and mining in DNA microarray 
and DNA chip anal.; cluster anal. of gene expression by Cluster/TreeView, MeV, and R; 
and a search tool for genes with similarities in expression profiles. READ (RIKEN 
expression array database; expression profile data from the RIKEN mouse cDNA 
microarray) and RINGENE (READ integrates gene expression neighbor) are briefly 
introduced. 
AB Background: Whereas genome sequencing has given the authors high-resoln. 
pictures of many different species of bacteria, microarrays provide a means of 
obtaining information on genome compn. for many strains of a given species. 
Genome-compn. anal. using microarrays, or "genomotyping", can be used to categorize 
genes into "present" and "divergent" categories based on the level of hybridization 
signal. This typically involves selecting a signal value that is used as a cutoff to 
discriminate present (high signal) and divergent (low signal) genes. Current methodol. 
uses empirical detn. of cutoffs for classification into these categories, but this 
methodol. is subject to several problems that can result in the misclassification of many 
genes. Results: the authors describe a method that depends on the shape of the 
signal-ratio distribution and does not require empirical detn. of a cutoff. Moreover, the 
cutoff is detd. on an array-to-array basis, accounting for variation in strain compn. and 
hybridization quality. The algorithm also provides an est. of the probability that any 
given gene is present, which provides a measure of confidence in the categorical 
assignment. Conclusions: Many genes previously classified as present using static 
methods are in fact divergent on the basis of microarray signal; this is cor. by the 
algorithm. The authors have reassigned hundreds of genes from previous 
genomotyping studies of Helicobacter pylori and Campylobacter jejuni strains, and 
expect that the algorithm should be widely applicable to genomotyping data. 
AB The "spotting" microarray technique, consisting of large sets of DNA sequences 
spotted on poly-L-lysine-coated glass microscope slides, has been developed to 
comparatively analyze genome-wide patterns of mRNA expression. It is now a valuable 
tool employed in order to quant. monitor gene expression profiles, as well as to analyze 
the alterations produced in case of genetic diseases, or induced by different 
treatments, abnormal nutrition and toxin. Our group improved the std. protocol as well 
as the results spreadsheet, adding new expts. and math. processing procedures in 
order to increase the accuracy of the data and get new information. In this 
contribution, we propose and verify two procedures to correct the spot ratios and a 
new protocol to get the normal variability of the digital gene expression. 
AB A microarray biochem. anal. system with an improved efficiency is provided, with 
which a visual observation is also realized. The biochem. anal. system comprises a 
point excitation means for generating an accelerated phosphorescence luminescent 
light by irradiating an excitation light to a site on an accumulative fluorescent body 
sheet corresponding to each hole on a membrane, a detection app. for reading the 
accelerated phosphorescence luminescent light generated at the resp. site and 
obtaining the numerical value data for each spot, a numerical value data anal. app. for 
performing an anal. with the numerical value data, a simulated imaging data formation 
app. for forming the simulated imaging data based on the numerical value data, and a 
computer equipped with a monitor for displaying the simulated imaging data as an 
image, and an imaging data anal. software. A flow diagram describing the system 
assembly is given. 
AB An instrument for generating a tissue microarray includes a coring tool for coring 
and removing a sample core from the tissue sample contained in the donor block. An 
image capture device for capturing a histol. image of a fixed section of tissue sample, 
corresponding to the tissue sample contained in the donor block, from a sample slide is 
further provided. A processor is coupled to the image capture device and can receive 
the histol. image of the fixed section of tissue sample from the image capture device. 
A display is coupled to the processor for displaying the histol. image. A user interface 
is coupled to the control system to allow a user to select from the displayed histol. 
image a location for coring and removing a sample core. 

L6 ANSWER 221 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 

AN 2003:51981 CAPLUS 

DN 139:21687 

TI Microarray data assembler 

AU Anbazhagan, Ramswamy 

CS Departments of Pathology and Oncology, Johns Hopkins University School of 
Medicine, Baltimore, MD, 21231, USA 

SO Bioinformatics (2003), 19(1), 157-158 CODEN: BOINFP; ISSN: 1367-4803 
PB Oxford University Press 
DT Journal 
LA English 

AB Large vols. of microarray data are generated and deposited in public databases. 
Most of these data are in the form of tab-delimited text files or Excel spreadsheets. 
Combining data from several of these files to reanalyze these data sets is time- 
consuming. Microarray Data Assembler is specifically designed to simplify this task. 
The program can list files and data sources, convert selected text files into Excel files 
and assemble data across multiple Excel worksheets and workbooks. This program 
thus makes data assembling easy, saves time and helps avoid manual error. 
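
The assembling task described in this abstract, combining several tab-delimited files into one table, can be illustrated with a short sketch. This is not the Microarray Data Assembler program itself. It assumes each file has a header row and a gene identifier in its first column, and it keeps only identifiers present in every file; the function names and sample data are hypothetical.

```python
import csv
import io


def read_table(text):
    """Parse tab-delimited text into (header, {first-column id: other cells})."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(rows)
    return header, {row[0]: row[1:] for row in rows}


def assemble(texts):
    """Join tables on their first column, keeping ids found in every table."""
    parsed = [read_table(text) for text in texts]
    header = [parsed[0][0][0]]
    for table_header, _ in parsed:
        header.extend(table_header[1:])
    shared = [key for key in parsed[0][1]
              if all(key in rows for _, rows in parsed)]
    body = [[key] + [cell for _, rows in parsed for cell in rows[key]]
            for key in shared]
    return [header] + body


# Two hypothetical data files with partly overlapping genes.
file_a = "gene\ts1\nG1\t1.0\nG2\t2.0\n"
file_b = "gene\ts2\nG2\t5.0\nG1\t4.0\nG3\t9.0\n"
for row in assemble([file_a, file_b]):
    print("\t".join(row))
```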
DT Journal 
LA English 

AB A distributable software package, called The Microarray Database of Gene 
Expression (MADGE), has been designed to track and store the various pieces of data 
generated by a cDNA microarray facility. This includes the clone collection storage 
data, annotation data, work-flow queues, microarray data, data repositories, sample 
submission information, and project/investigator information. This application was 
designed using a 3-tier client-server model. The data access layer (1st tier) contains 
the relational database system tuned to support a large no. of transactions. The data 
services layer (2nd tier) is a distributed COM server with full database transaction 
support. The application layer (3rd tier) is an Internet based user interface that 
contains both client- and server-side code for dynamic interactions with the user. 
AB Identification of essential genes is one of the ultimate goals of drug design. Here 
we introduce an in silico method to select essential genes through the microarray 
assay. We construct a graph of genes, called the gene transcription network, based on 
the Pearson correlation coeff. of the microarray expression level. Links are connected 
between genes following the order of the pairwise correlation coeffs. We find that 
there exist two meaningful fractions of links connected, pm and ps, where the no. of 
clusters becomes max. and the connectivity distribution follows a power law, resp. 
Interestingly, one of the clusters at pm contains a high d. of essential genes having almost 
the same functionality. Thus the deletion of all genes belonging to that cluster can 
efficiently lead to a lethal, inviable mutant. Such an essential cluster can be identified in a 
self-organized way. We first measure the connectivity of each gene at ps. Then, using 
the property that the essential genes are likely to have more connectivity, we can 
identify the essential cluster by finding the one having the largest mean connectivity 
per gene at pm. 
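
The network construction in this abstract (add links between genes in order of pairwise Pearson correlation and watch how the number of clusters changes) can be sketched with a union-find structure. This is a rough illustration under stated assumptions: links are added from highest to lowest signed correlation, a "cluster" means a connected group of at least two genes, and the expression profiles are invented. The abstract does not confirm any of these choices.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cluster_counts(profiles):
    """Add links in descending correlation order; after each link, record
    the number of connected groups containing at least two genes."""
    genes = list(profiles)
    links = sorted(combinations(genes, 2),
                   key=lambda pair: pearson(profiles[pair[0]], profiles[pair[1]]),
                   reverse=True)
    parent = {g: g for g in genes}
    size = {g: 1 for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    counts, clusters = [], 0
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            if size[root_a] == 1 and size[root_b] == 1:
                clusters += 1   # two isolated genes form a new cluster
            elif size[root_a] > 1 and size[root_b] > 1:
                clusters -= 1   # two existing clusters merge into one
            parent[root_b] = root_a
            size[root_a] += size[root_b]
        counts.append(clusters)
    return counts


# Hypothetical expression profiles for four genes.
profiles = {
    "a": [1, 2, 3, 4], "b": [1, 2, 3, 5],
    "c": [1, -1, 1, -1], "d": [1, -1, 1, -2],
}
counts = cluster_counts(profiles)
print(counts)
```

Under these assumptions, the fraction of links at which the count peaks plays the role of pm in the abstract.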
AB We present an anal. of phys. chem. constraints on the accuracy of DNA micro- 
arrays under equil. and nonequil. conditions. At the beginning of the article we 
describe an algorithm for choosing a probe set with high specificity for targeted genes 
under equil. conditions. The algorithm as well as existing methods is used to select 
probes from the full Saccharomyces cerevisiae genome, and these probe sets, along 
with a randomly selected set, are used to simulate array expts. and identify sources of 
error. Inasmuch as specificity and sensitivity are max. at thermodn. equil., we are 
particularly interested in the factors that affect the approach to equil. These are 
analyzed later in the article, where we develop and apply a rapidly executable method 
to simulate the kinetics of hybridization on a solid phase support. Although the 
difference between soln. phase and solid phase hybridization is of little consequence for 
specificity and sensitivity when equil. is achieved, the kinetics of hybridization has a 
pronounced effect on both. We first use the model to est. the effects of diffusion, 
cross-hybridization, relaxation time, and target concn. on the hybridization kinetics, and 
then investigate the effects of the most important kinetic parameters on specificity. 
We find even when using probe sets that have high specificity at equil. that substantial 
cross-hybridization is present under nonequil. conditions. Although those complexes 
that differ from perfect complementarity by more than a single base do not contribute 
to sources of error at equil., they slow the approach to equil. dramatically and 
confound interpretation of the data when they dissoc. on a time scale comparable to 
the time of the expt. For the best probe set, our simulation shows that steady-state 
behavior is obtained in a relaxation time of .apprx.12-15 h for exptl. target concns. 
.apprx.(10^-13 - 10^-14) M, but the time is greater for lower target concns. in the range 
(10^-15 - 10^-16) M. The result points to an asymmetry in the accuracy with which up- 
and down-regulated genes are identified. 
AB Modeling the relationship between genomic features and therapeutic response is 
of central interest in pharmacogenomics [Musumarra et al., 2001]. The NCI-60 cancer 
data set with both gene expression and drug activity measurements provides an 
excellent opportunity for this modeling exercise. To correlate the gene expression 
profile with the drug activity pattern, we utilized a soft modeling technique called 
Partial Least Squares (PLS) [Tobias, 2000]. Soft modeling requires less stringent 
assumptions about the data than other modeling techniques [Falk et al., 1992]. A high 
level of collinearity in multi-dimensional gene expression profiles motivates us to 
undertake the PLS approach, which not only trims data redundancy but also exposes 
the underlying hidden functional units as latent features. It is believed that these 
functional gene groups play a key role in detg. the efficacy of the cancer drugs to 
different cell lines (types of cancer). We have shown the efficacy of PLS in identifying 
drug resistant and drug sensitive genes. We have also investigated techniques to 
exploit the non-linear dependence between individual gene expressions in order to 
explain variations in the drug activity pattern. This is facilitated by a kernel function 
that implicitly carries out the regression in a higher-dimensional space where the data 
is linear [Christiannini et al., 2000]. The kernel-based non-linear approach is shown to 
be more effective in defining the correlation between the drug response and the gene 
expressions. The PLS approach, as implemented here, could be used to differentiate 
cancer cell lines between renal cancer and melanoma, for example, or different drug 
groups like Alkylating agents and Tubulin-active anti-mitotic agents. 
AB The Rosetta data set opens the possibility of comparing an exptl. microarray data 
set with a ref. profile from the compendium. However, explaining this comparison in 
terms of individual genes could be a daunting task because of the sheer no. of genes. 
Thus, we postulate a new strategy of modeling microarray data in terms of functional 
genomic units (FGUs). A functional genomic unit is a group of genes that carries out a 
certain biol. function. We explored the possibility of defining the functional genomic 
units from the Gene Ontol. (GO) annotation of the yeast genome. To visualize the tree 
structure of the GO, we have written a yeast genomic knowledge browser in Java, and 
integrated it with the microarray data. The pitfall of using the GO is that only a portion 
of the genes in the genome are functionally known or inferred. Thus, we further 
investigated an unsupervised learning method to identify those functional genomic 
units in the yeast genome. We have applied an established anal. method from digital 
signal processing, Independent Component Anal. (ICA), to the Rosetta data set. To 
further validate the utility of the Rosetta compendium, we have designed an expt. to 
investigate the yeast cells transfected with human Rac1, a small GTPase protein of the 
Rho family, and demonstrated that functional genomic units helped us to corroborate 
our own microarray expt. with the Rosetta data set. 
AB A review. Now that the human genome has been sequenced, the measurement, 
processing, and anal. of specific genomic information in real time are gaining 
considerable interest because of their importance to better the understanding of the 
inherent genomic function, the early diagnosis of disease, and the discovery of new 
drugs. Traditional methods to process and analyze DNA or RNA data, based on 
the statistical or Fourier theories, are not robust enough and are time-consuming, and 
thus not well suited for future routine and rapid medical applications, particularly for 
emergency cases. An overview of some recent applications of signal processing 
techniques for DNA structure prediction, detection, feature extn., and classification of 
differentially expressed genes is presented. Emphasis is placed on the application of 
wavelet transform in DNA sequence anal. and on cellular neural networks in microarray 
image anal., which can have a potentially large effect on the real-time realization of 
DNA anal. Finally, some interesting areas for possible future research are summarized, 
which include a biomodel-based signal processing technique for genomic feature extn. 
and hybrid multidimensional approaches to process the dynamic genomic information 
in real time. 
AB Most diseases are caused by a set of gene defects, which occur in a complex 
assocn. The assocn. scheme of expressed genes can be modeled by genetic networks. 
Genetic networks efficiently facilitate understanding the dynamics of pathogenic 
processes by modeling mol. reality of cell conditions. In this sense a genetic network 
consists of first, a set of genes of specified cells, tissues or species, and second, causal 
relationships between these genes detg. the functional condition of the biol. system 
under disease. A relationship between two genes will exist if they both are directly or 
indirectly assocd. with disease [8]. Our goal is to characterize diseases (esp. 
autoimmune diseases like chronic pancreatitis CP, multiple sclerosis MS, rheumatoid 
arthritis RA) by genetic networks generated by a computer system. We want to 
introduce this practice as a bioinformatic approach for finding targets. 
AB The present invention relates to genetic markers whose expression is correlated 
with breast cancer. Specifically, the invention provides sets of markers whose 
expression patterns can be used to differentiate clin. conditions assocd. with breast 
cancer, such as the presence or absence of the estrogen receptor ESR1, and BRCA1 
and sporadic tumors, and to provide information on the likelihood of tumor distant 
metastases within five years of initial diagnosis. The invention relates to methods of 
using these markers to distinguish these conditions. The invention also relates to kits 
contg. ready-to-use ***microarrays*** and ***computer*** software for data 
anal. using the statistical methods disclosed herein. 
AB This paper reports the methods and results of a computer-based search for causal 
relationships in the gene-regulation pathway of galactose metab. in the yeast 
Saccharomyces cerevisiae. The search uses recently published data from cDNA 
microarray expts. A Bayesian method was applied to learn causal networks from a 
mixt. of observational and exptl. gene-expression data. The observational data were 
gene-expression levels obtained from unmanipulated "wild-type" cells. The exptl. data 
were produced by deleting ("knocking out") genes and observing the expression levels 
of other genes. Causal relations predicted from the anal. on 36 galactose gene pairs 
are reported and compared with the known galactose pathway. Addnl. exploratory 
analyses are also reported. 
AB The invention provides compns. and methods for the detection, identification, and 
quantification of microorganisms, cells, or protein mixts. in a sample. 
AB Microarrays specific for breast cancer are provided. Also provided are methods for 
detecting breast cancer in patients or screening therapeutics for the treatment or 
prevention of breast cancer by analyzing expression levels of specific genes in BECs or 
quantifying specific protein levels in breast ductal fluid. 
AB Using microarrays is a powerful technique to monitor the expression of thousands 
of genes in a single expt. From series of such expts., it is possible to identify the 
mechanisms that govern the activation of genes in an organism. Short DNA patterns 
(called binding sites) near the genes serve as switches that control gene expression. 
As a result similar patterns of expression can correspond to similar binding site 
patterns. Here we integrate clustering of coexpressed genes with the discovery of 
binding motifs. We overview several important clustering techniques and present a 
clustering algorithm (called adaptive quality-based clustering), which we have 
developed to address several shortcomings of existing methods. We overview the 
different techniques for motif finding, in particular the technique of Gibbs sampling, 
and we present several extensions of this technique in our Motif Sampler. Finally, we 
present an integrated web tool called INCLUSive that allows the easy anal. of 
microarray data for motif finding. 
AB The creation of tissue microarrays (TMAs) allows for the rapid immunohistochem. 
anal. of thousands of tissue samples, with numerous different antibodies per sample. 
This tech. development has created a need for tools to aid in the anal. and archival 
storage of the large amts. of data generated. We have developed a comprehensive 
system for high-throughput anal. and storage of TMA immunostaining data, using a 
combination of com. available systems and novel software applications developed in 
our lab. specifically for this purpose. Staining results are recorded directly into an Excel 
worksheet and are reformatted by a novel program (TMA-Deconvoluter) into a format 
suitable for hierarchical clustering anal. or other statistical anal. Hierarchical clustering 
anal. is a powerful means of assessing relatedness within groups of tumors, based on 
their immunostaining with a panel of antibodies. Other analyses, such as generation of 
survival curves, construction of Cox regression models, or assessment of intra- or 
interobserver variation, can also be done readily on the reformatted data. Finally, the 
immunoprofile of a specific case can be rapidly retrieved from the archives and 
reviewed through the use of Stainfinder, a novel web-based program that creates a 
direct link between the clustered data and a digital image database. An online 
demonstration of this system is available at 
http://genome-www.stanford.edu/TMA/explore.shtml. 
AB A method for estg. the factor and interaction effects for gene expression 
microarray expts. is disclosed. The method requires the inversion of two square 
matrixes of size p and p', resp., instead of a matrix of size q where q>>p.apprxeq.p'. 
The factors causing variance in microarray dataset include gene-related factors, like 
orthogonality and other factors in the expt., and non-gene factors, like variety factor, 
dye factor and array factor. In particular, disclosed is a method for estg. at least one 
gene-variety interaction in a gene expression microarray expt. having an exptl. design 
characterized by a no. of degrees of freedom, q, and defined by a gene factor, a 
plurality of non-gene factors, a plurality of two-factor interactions wherein a full 
replication of genes is present for every combination of the plurality of non-gene 
factors, the method comprising the steps of: (a) inverting a first square matrix 
characterized by a size, p, wherein p<q; (b) estg. at least one of a plurality of 
non-gene factor effects from the first square matrix inverse; (c) constructing a second 
square matrix based in part on the estd. non-gene factor, the second square matrix 
characterized by size, p', wherein p'<q; (d) inverting a second square matrix; and (e) 
estg. at least one gene-variety interaction from the inverted second square matrix. 
Also disclosed is a method for estg. at least one gene-variety interaction in a gene 
expression microarray expt. generating a dataset and having a design characterized by 
a arrays, v varieties, n genes, and d dyes wherein a full replication of genes is present 
for every combination of arrays, varieties and dyes, the method comprising the steps of: 
(a) constructing a global data vector, d, based on a plurality of avs. of the dataset; 
(b) constructing a square matrix, T, characterized by a size, p, wherein p=a+v+d-3; (c) 
inverting the square matrix, T; (d) estg. the global effects, .tau., wherein .tau.=Td; (e) 
constructing a square matrix, Tg, characterized by a size, p', wherein p'=p-1; (f) 
constructing a gene-specific data vector, dg, based on a plurality of avs. of the dataset; 
(g) inverting the square matrix, Tg; and (h) estg. the gene-variety interaction, .tau.g, 
wherein .tau.g=Tgdg. Furthermore, disclosed is a method of non-gene factor 
interaction wherein the transformed dataset is created according to the equation: Xijkgs 
= Yijkgs - .mu.~ - Ai~ - Dj~ - (AD)ij, where Xijkgs is the transformed measurement of 
the Yijkgs measurement, Yijkgs is the ijkgsth measurement; .mu.~ is the estd. mean of 
all measurements; Ai~ is the estd. array effect for the ith array; Dj~ is the estd. dye 
effect for the jth dye; (AD)ij is the estd. array-dye interaction effect of the ith array and 
the jth dye. The invention also includes implementation of the methods in computer 
software, computer readable media comprising these software instructions, and 
computer systems for performing the methods. 
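
Note: the transformation Xijkgs = Yijkgs - .mu.~ - Ai~ - Dj~ - (AD)ij described in the abstract above can be illustrated with a minimal Python sketch. It assumes a fully balanced design and estimates the mean, array, dye and array-dye effects by simple group averages. That estimator is an assumption made here for illustration; it is not the matrix-inversion procedure of the patent, and the function name and dictionary layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def remove_array_dye_effects(y):
    """Subtract estimated mean, array, dye and array-dye effects.

    y maps (array i, dye j, variety k, gene g) -> measurement (hypothetical layout).
    Effects are estimated by group averages, assuming a balanced design.
    """
    mu = mean(y.values())  # estimated overall mean

    by_array, by_dye, by_cell = defaultdict(list), defaultdict(list), defaultdict(list)
    for (i, j, _k, _g), value in y.items():
        by_array[i].append(value)
        by_dye[j].append(value)
        by_cell[(i, j)].append(value)

    a = {i: mean(v) - mu for i, v in by_array.items()}  # array effects
    d = {j: mean(v) - mu for j, v in by_dye.items()}    # dye effects
    # array-dye interaction: what is left of each cell mean after main effects
    ad = {(i, j): mean(v) - mu - a[i] - d[j] for (i, j), v in by_cell.items()}

    return {
        (i, j, k, g): value - mu - a[i] - d[j] - ad[(i, j)]
        for (i, j, k, g), value in y.items()
    }
```

Under these assumptions the four estimated terms add up to the mean of each array-dye cell, so the transformed values within every cell average to zero and only gene-related variation remains.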
AB A single microarray can provide information on the expression of tens of 
thousands of genes. The amt. of information generated by a microarray based expt. is 
sufficiently large that no single study can be expected to mine each nugget of scientific 
information. As a consequence, the scale and complexity of ***microarray*** 
expts. require that ***computer*** software programs do much of the data 
processing, storage, visualization, anal. and transfer. The adoption of common stds. 
and ontologies for the management and sharing of microarray data is essential and will 
provide immediate benefit to the research community. 
AB The goal of clustering is to organize microarray data so that the underlying 
structures can be recognized and explored. Four aspects of clustering gene expression 
are presented here: (1) microarray data structure, (2) the Cluster and TreeView 
software packages, (3) types of clustering and math. principles, and (4) adjusting and 
filtering data. 
AB A review addresses data management of the results generated from glass slide 
microarrays that have been spotted with DNA mols. Topics discussed include the 
information to be captured by a lab. information management system; requirements of 
results databases; and selecting database software. 
AB Background: The rapidly expanding fields of genomics and proteomics have 
prompted the development of computational methods for managing, analyzing and 
visualizing expression data derived from microarray screening. Nevertheless, the lack of 
efficient techniques for assessing the biol. implications of gene-expression data remains 
an important obstacle in exploiting this information. Results: To address this need, the 
authors have developed a mining technique based on the anal. of literature profiles 
generated by extg. the frequencies of certain terms from thousands of abstrs. stored in 
the Medline literature database. Terms are then filtered on the basis of both repetitive 
occurrence and co-occurrence among multiple gene entries. Finally, clustering anal. is 
performed on the retained frequency values, shaping a coherent picture of the 
functional relationship among large and heterogeneous lists of genes. Such data 
treatment also provides information on the nature and pertinence of the assocns. that 
were formed. Conclusions: The anal. of patterns of term occurrence in abstrs. 
constitutes a means of exploring the biol. significance of large and heterogeneous lists 
of genes. This approach should contribute to optimizing the exploitation of microarray 
technologies by providing investigators with an interface between complex expression 
data and large literature resources. 
SO Proceedings of the National Academy of Sciences of the United States of America 
LA English 

AB A major challenge in DNA microarray anal. is to effectively dissoc. actual gene 
expression values from exptl. noise. We report here a detailed noise anal. for 
oligonucleotide-based microarray expts. involving reverse transcription, generation of 
labeled cRNA (target) through in vitro transcription, and hybridization of the target to 
the probe immobilized on the substrate. By designing sets of replicate expts. that 
bifurcate at different steps of the assay, we are able to sep. the noise caused by 
sample prepn. and the hybridization processes. We quant. characterize the strength of 
these different sources of noise and their resp. dependence on the gene expression 
level. We find that the sample prepn. noise is small, implying that the amplification 
process during the sample prepn. is relatively accurate. The hybridization noise is 
found to have very strong dependence on the expression level, with different 
characteristics for the low and high expression values. The hybridization noise 
characteristics at the high expression regime are mostly Poisson-like, whereas its 
characteristics for the small expression levels are more complex, probably due to cross- 
hybridization. A method to evaluate the significance of gene expression fold changes 
based on noise characteristics is proposed. 
PB Oxford University Press 
DT Journal 
AB Two-color microarray expts. in which an aliquot derived from a common RNA 
sample is placed on each array are called ref. designs. Traditionally, microarray expts. 
have used ref. designs, but designs without a ref. have recently been proposed as 
alternatives. We develop a statistical model that distinguishes the different levels of 
variation typically present in cancer data, including biol. variation among RNA samples, 
exptl. error and variation attributable to phenotype. Within the context of this model, 
we examine the ref. design and two designs which do not use a ref., the balanced 
block design and the loop design, focusing particularly on efficiency of ests. and the 
performance of cluster anal. We calc. the relative efficiency of designs when there are 
a fixed no. of arrays available, and when there are a fixed no. of samples available. 
Monte Carlo simulation is used to compare the designs when the objective is class 
discovery based on cluster anal. of the samples. The no. of discrepancies between the 
estd. clusters and the true clusters were significantly smaller for the ref. design than for 
the loop design. The efficiency of the ref. design relative to the loop and block designs 
depends on the relation between inter- and intra-sample variance. These results 
suggest that if cluster anal. is a major goal of the expt., then a ref. design is preferable. 
If identification of differentially expressed genes is the main concern, then design 
selection may involve a consideration of several factors. 
AB ARROGANT (ARRay OrGANizing Tool) is a software tool developed to facilitate the 
identification, annotation and comparison of large collections of genes or clones. The 
objective is to enable users to compile gene/clone collections from different databases, 
allowing them to design expts. and analyze the collections as well as assocd. exptl. 
data efficiently. ARROGANT can relate different sequence identifiers to their common 
ref. sequence using the UniGene database, allowing for the comparison of data from 
two different microarray expts. ARROGANT has been successfully used to analyze 
microarray expression data for colon cancer, to compile genes potentially related to 
cardiac diseases for subsequent resequencing (to identify single nucleotide 
polymorphisms, SNPs), to design a new comprehensive human cDNA microarray for 
cancer, to combine and compare expression data generated by different microarrays 
and to provide annotation for genes on custom and Affymetrix chips. 
CS Departments of Medicine, Molecular and Human Genetics, Baylor College of 
AB The authors used scaled factorial moments to search for intermittency in the log 
expression ratios (LERs) for thousands of genes spotted on cDNA microarrays (gene 
chips). Results indicate varying levels of intermittency in gene expression. The 
observation of intermittency in the data analyzed provides a complementary handle on 
moderately expressed genes, generally not tackled by conventional techniques. 
AB A review and discussion. The automated anal. of transcriptional profiling data 
promises a wealth of information that may be used to develop a more complete 
understanding of gene function and interactions. Moreover, it may improve the 
effectiveness of complex diagnostic tasks. This article discusses important data mining 
and management techniques to analyze genome-wide expression data. It reviews 
some of the major discovery goals, methods and applications in a no. of biomedical 
domains. Finally, this paper highlights key problems that need to be approached by a 
new generation of computational solns. 
AB High-throughput genomic technologies are revolutionizing modern biol. In 
particular, DNA microarrays have become one of the most powerful tools for profiling 
global mRNA expression in different tissues and environmental conditions, and for 
detecting single nucleotide polymorphisms. The broad applicability of gene expression 
profiling to the biol. and medical realms has generated expanding demand for mass 
prodn. of microarrays, which in turn has created considerable interest in improving the 
cost effectiveness of microarray fabrication techniques. The authors have developed 
the computational framework for an optimal synthesis strategy for oligonucleotide 
microarrays. The problem was introduced by Hubbell et al. Here, the authors formalize 
the problem, obtain precise bounds on its complexity and devise several computational 
solns. 
AB Affymetrix microarrays are used by many labs. to generate gene expression 
profiles. Generally, only large differences (> 1.7-fold) between conditions have been 
reported. Computational methods to reduce inter-array variability might be of value 
when attempting to detect smaller differences. We examd. whether inter-array 
variability could be reduced by using data based on the Affymetrix algorithm for 
pairwise comparisons between arrays (ratio method) rather than data based on the 
algorithm for anal. of individual arrays (signal method). Six HG-U95A arrays that 
probed mRNA from young (21-31 yr old) human muscle were compared with six arrays 
that probed mRNA from older (62-77 yr old) muscle. Differences in mean expression 
levels of young and old subjects were small, rarely > 1.5-fold. The mean within-group 
coeff. of variation for 4629 mRNAs expressed in muscle was 20% according to the ratio 
method and 25% according to the signal method. The ratio method yielded more 
differences according to t-tests (124 vs. 98 differences at P < 0.01), rank sum tests 
(107 vs. 85 differences at P < 0.01), and the Significance Anal. of Microarrays method 
(124 vs. 56 differences with false detection rate < 20%; 20 vs. 0 differences with false 
detection rate < 5%). The ratio method also improved consistency between results of 
the initial scan and results of the antibody-enhanced scan. The ratio method reduces 
inter-array variance and thereby enhances statistical power. 
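
Note: the mean within-group coeff. of variation quoted above (20% vs. 25%) can be illustrated with a short Python sketch. The abstract does not give the exact formula, so this assumes the common definition: the sample standard deviation divided by the mean, computed per mRNA across the replicate arrays of one group and then averaged over mRNAs. The function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def mean_within_group_cv(expression):
    """Average percent coefficient of variation over all mRNAs in one group.

    expression maps an mRNA identifier -> list of expression values,
    one per replicate array in the group (hypothetical layout).
    Each list needs at least two values and a non-zero mean.
    """
    cvs = [100.0 * stdev(values) / mean(values) for values in expression.values()]
    return mean(cvs)


# Usage with made-up numbers: one perfectly stable mRNA, one variable mRNA.
group = {"mRNA_1": [10.0, 10.0, 10.0], "mRNA_2": [8.0, 10.0, 12.0]}
print(mean_within_group_cv(group))
```

A lower value for the same samples indicates less inter-array variability, which is the sense in which the abstract compares the ratio and signal methods.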
AB Chitin oligomers, released from fungal cell walls by endochitinase, induce defense 
and related cellular responses in many plants. However, little is known about chitin 
responses in the model plant Arabidopsis. The authors describe here a large-scale 
characterization of gene expression patterns in Arabidopsis in response to chitin 
treatment using an Arabidopsis microarray consisting of 2375 EST clones representing 
putative defense-related and regulatory genes. Transcript levels for 71 ESTs, 
representing 61 genes, were altered three-fold or more in chitin-treated seedlings 
relative to control seedlings. A no. of transcripts exhibited altered accumulation as 
early as 10 min after exposure to chitin, representing some of the earliest changes in 
gene expression obsd. in chitin-treated plants. Included among the 61 genes were 
those that have been reported to be elicited by various pathogen-related stimuli in 
other plants. Addnl. genes, including genes of unknown function, were also identified, 
broadening our understanding of chitin-elicited responses. Among transcripts with 
enhanced accumulation, one cluster was enriched in genes with both the W-box 
promoter element and a novel regulatory element. In addn., a no. of transcripts had 
decreased abundance, encoding several proteins involved in cell wall strengthening and 
wall deposition. The chalcone synthase promoter element was identified in the 
upstream regions of these genes, suggesting that pathogen signals may suppress the 
expression of some genes. These data indicate that Arabidopsis should be an excellent 
model to elucidate the mechanisms of chitin elicitation in plant defense. 
AB A review. One of the foremost challenges of 21st century biol. research will be to 
decipher the complex genetic regulatory networks responsible for embryonic 
development. The recent explosion of whole genome sequence data and of genome- 
wide transcriptional profiling methods, such as microarrays, coupled with the 
development of sophisticated computational tools for exploiting and analyzing genomic 
data, provide a significant starting point for regulatory network anal. In this article we 
review some of the main methodol. issues surrounding genome annotation, 
transcriptional profiling, and computational prediction of cis-regulatory elements and 
discuss how the power of model genetic organisms can be used to exptl. verify and 
extend the results of genomic research. 

RE.CNT 113 THERE ARE 113 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 249 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:749178 CAPLUS 
DN 138:20101 

TI Bayesian hierarchical model for identifying changes in gene expression from 
microarray experiments 

AU Broet, Philippe; Richardson, Sylvia; Radvanyi, Francois 

CS Faculte de Medecine, Universite Paris XI and INSERM U472, Hopital Paul Brousse, 

Villejuif, 94807, Fr. 

SO Journal of Computational Biology (2002), 9(4), 671-683 CODEN: JCOBEM; ISSN: 
1066-5277 

PB Mary Ann Liebert, Inc. 
DT Journal 
LA English 

AB Recent developments in microarrays technol. enable researchers to study 
simultaneously the expression of thousands of genes from one cell line or tissue 
sample. This new technol. is often used to assess changes in mRNA expression upon a 
specified transfection for a cell line in order to identify target genes. For such expts., 
the range of differential expression is moderate, and teasing out the modified genes is 
challenging and calls for detailed modeling. The aim of this paper is to propose a 
methodol. framework for studies that investigate differential gene expression through 
microarrays technol. that is based on a fully Bayesian mixt. approach. A case study 
that investigated those genes that were differentially expressed in two cell lines 
(normal and modified by a gene transfection) is provided to illustrate the performance 
and usefulness of this approach. 
AB The present invention provides methods for interfacing computer technol. with 
biol. and chem. processing and synthesis equipment. In preferred embodiments, the 
present invention features methods for the computer to interface with equipment 
useful for biol. and chem. processing and synthesis in a remote manner. 
AB DNA microarrays produce large amts. of data. Complex changes in gene 
expression are revealed; sometimes thousands of mRNAs change between expts. Here 
we apply modular regulation anal. to microarray data to reveal and quantify the mRNA 
changes that are important for cellular responses. The mRNAs are sorted into clusters. 
How strongly a perturbation alters each cluster is multiplied by how strongly each 
cluster affects an output, to obtain coeffs. that describe how much of the change in the 
output is transmitted through each mRNA cluster. An example published dataset is 
analyzed to reveal that the response (relative fitness) of yeast to 2-deoxy-D-glucose is 
not transmitted by a single mRNA cluster, but instead many clusters contribute to the 
overall response. The method is applicable to microarray, transcriptome, proteome and 
metabolome data. 
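
Note: the multiplication step described in the abstract above (how strongly a perturbation alters each cluster, times how strongly each cluster affects the output) can be sketched in Python as follows. This is only an illustration of that product; how the two input quantities are measured is not specified here, and the function names and dictionary layout are hypothetical.

```python
def transmitted_changes(cluster_response, output_sensitivity):
    """Part of the output change transmitted through each mRNA cluster.

    cluster_response[c]: how strongly the perturbation alters cluster c.
    output_sensitivity[c]: how strongly cluster c affects the output.
    """
    return {c: cluster_response[c] * output_sensitivity[c] for c in cluster_response}


def fractional_contributions(partials):
    """Express each cluster's transmitted change as a share of the total."""
    total = sum(partials.values())
    if total == 0:
        raise ValueError("total transmitted change is zero; shares are undefined")
    return {c: value / total for c, value in partials.items()}


# Usage with made-up numbers for two clusters.
partials = transmitted_changes({"A": 2.0, "B": 1.0}, {"A": 0.25, "B": 0.5})
print(fractional_contributions(partials))
```

When no single share dominates, the response is spread over many clusters, which is the situation the abstract reports for yeast and 2-deoxy-D-glucose.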
AB We introduce a max. entropy-based anal. technique for extg. and characterizing 
rhythmic expression profiles from DNA microarray hybridization data. These patterns 
are clues to discovering genes implicated in cell-cycle, circadian, and other periodic 
biol. processes. The algorithm, implemented in a program called ENRAGE 
(Entropy-based Rhythmic Anal. of Gene Expression), treats the task of estg. an expression 
profile's periodicity and phase as a simultaneous bicriterion optimization problem. 
Specifically, a frequency domain spectrum is reconstructed from a time-series of gene 
expression data, subject to two constraints: (a) the likelihood of the spectrum and (b) 
the Shannon entropy of the reconstructed spectrum. Unlike Fourier-based spectral 
anal., max. entropy spectral reconstruction is well suited to signals of the type 
generated in DNA microarray expts. Our algorithm is optimal, running in linear time in 
the no. of expression profiles. Moreover, an implementation of our algorithm runs an 
order of magnitude faster than previous methods. Finally, we demonstrate that 
ENRAGE is superior to other methods at identifying and characterizing periodic 
expression profiles on both synthetic and actual DNA microarray hybridization data. 
AB The invention provides a device and method for detecting interactions between 
oligonucleotides immobilized on a solid array and mRNAs (cDNAs) that are added to 
the array. The invention relates that the method involves chromophore-labeled mRNAs 
(cDNAs) binding to array oligonucleotides labeled with a second chromophore. The 
invention also relates that the detection of the interaction relies on fluorescence 
resonance energy transfer (FRET) between the two chromophores. The invention 
further relates that FRET can be detected by shining an appropriate laser or other 
suitable controlled light source onto arrays to excite one of the matched pair of 
chromophores. The invention proposed that said device and method can be used for 
detg. structural parameters of native RNA transcripts and for detg. regions that may be 
effective targets for antisense mediated gene knockout. 
AB The present invention provides methods and means for analyzing, designing, 
selecting and generating oligomer sequences, such as those for use in multiplex 
array-based nucleic acid probe systems, down to the selection of a single pair of optimal 
primer/target oligomers. Sequences are represented by a function of sequence 
context, called the context functional descriptor. In addn. to the consideration of base 
pairing and nearest-neighbor anal., the present computational methods incorporate the 
use of context functional descriptors and correlation matrixes to account for higher- 
order thermodn. interactions between nucleic acid sequences. The Sequence Design 
Turbo Generator, SEQ-TG, technol. is explained and applied. The SEQ-TG is an anal. 
process comprised of computer-driven algorithms that utilize specified sequence- 
dependent input parameters and user-defined sequence constraints. It provides for de 
novo design of sets of nucleic acid oligomer sequences with precisely defined 
properties, and selection of subsets of sequences from larger sequence sets that have 
the desired predicted properties. SEQ-TG can be applied to generate sequences with 
optimum multiplex compatibility for use on microarrays or in multiplex soln. 
applications, or for purposes of designing optimal and unique probes and primers. 
AB A review. Biotechnol. break-through to promote research on single biomol. was 
discussed. The coverage of the topics included DNA chip/ ***microarray*** 
technologies, protein engineering, lab on a chip technol., use of ***computer*** 
simulation and bioinformatics. 
AB A review. Microarray expts. are multi-step processes. At each step - the growth 
of cultures, extn. of mRNA, reverse transcription, labeling, hybridization, scanning, and 
image anal. - variation and error cannot be completely avoided. Estg. the amt. of such 
noise and variation is essential, not only to test for differential expression but also to 
suggest at which level replication is most effective. Replication and averaging are the 
key to the estn. as well as the redn. of variability. Here I discuss the use of ANOVA 
mixed models and of anal. of variance components as a rigorous way to calc. the no. of 
replicates necessary to detect a given target fold-change in expression levels. 
AB A review. The manuf. and use of a whole-genome microarray is a complex 
process and it is essential that all data surrounding the process is stored, is accessible 
and can be easily assocd. with the data generated following hybridization and scanning. 
As part of a program funded by the Wellcome Trust, the Bacterial Microarray Group at 
St. George's Hospital Medical School (B.mu.G@S) will generate whole-genome 
microarrays for 12 bacterial pathogens for use in collaboration with specialist research 
groups. B.mu.G@S will collaborate with these groups at all levels, including the exptl. 
design, methodol. and anal. In addn., we will provide informatic support in the form of 
a database system (B.mu.G@Sbase). B.mu.G@Sbase will provide access through a 
web interface to the microarray design data and will allow individual users to store their 
data in a searchable, secure manner. Tools developed by B.mu.G@S in collaboration 
with specific research groups investigating anal. methodol. will also be made available 
to those groups using the arrays and submitting data to B.mu.G@Sbase. 
AB Motivation: Transcriptional profiling using microarrays can reveal important 
information about cellular and tissue expression phenotypes, but these measurements 
are costly and time consuming. Addnl., tissue sample availability poses further 
constraints on the no. of arrays that can be analyzed in connection with a particular 
disease or state of interest. It is therefore important to provide a method for the detn. 
of the min. no. of microarrays required to sep., with statistical reliability, distinct 
disease states or other physiol. differences. Results: Power anal. was applied to est. 
the min. sample size required for two-class and multi-class discrimination. The power 
anal. algorithm calcs. the appropriate sample size for discrimination of phenotypic 
subtypes in a reduced dimensional space obtained by Fisher discriminant anal. (FDA). 
This approach was tested by applying the algorithm to existing data sets for estn. of 
the min. sample size required for drawing certain conclusions on multi-class distinction 
with statistical reliability. It was confirmed that when the min. no. of samples estd. 
from power anal. is used, group means in the FDA discrimination space are statistically 
different. 
AB A randomization procedure to evaluate the significance level and the 
false-discovery rate in complex microarray expts. is proposed. A related graph can be used 
to compare different test statistics that can be used to analyze the same expt. This 
graph is closely related to receiver operator characteristic (ROC) curves. The proposed 
method is applied to a subset of the data from a cell-line expt. related to Huntington's 
disease. A small simulation study compares the effectiveness of the proposed 
procedure with the significance anal. of microarrays (SAM) procedure. 
AB Summary: MArray is a Matlab toolbox with a graphical user interface that allows 
the user to analyze single or paired microarray datasets by direct input of the raw data 
output file from image anal. packages, such as QuantArray or GenePix. The application 
provides simple procedures to manually evaluate the quality of each measurement, 
multiple approaches to both ratio normalization (simple normalization, intensity 
dependent normalization) and evaluation of the reproducibility of paired expts. (using 
the techniques 'simple statistical method', 'quality control ellipse' and 'significance 
anal. of microarrays'). Specifically, interactive spot evaluation functions are available in 
MArray and an online gene information database (NCBI UniGene) is linked. The 
application may provide a valuable aid in selecting and optimizing exptl. procedures, as 
well as serving as an anal. tool for two-state biol. comparisons, such as a study of 
single-dose activation. It is entirely platform independent, and only requires Matlab 
installed. 
PI WO 2002064743 A2 20020822 WO 2002-US4219 20020212 WO 
2002064743 A3 20021031 W: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, 
BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, 
GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, 
LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, 
SD, SE, SG, SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZM, 
ZW RW: GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW, AT, BE, CH, 
CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR, BF, BJ, CF, CG, 
CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG AU 2002250066 A1 
20020828 AU 2002-250066 20020212 

PRAI US 2001-268173P P 20010212 WO 2002-US4219 W 20020212 
AB The invention relates to methods and systems (e.g., computer systems and 
***computer*** program products) for identifying and characterizing genes using 
***microarrays***. In particular, the invention provides for improved, robust 
methods for detecting genes through the use of microarrays to analyze the expression 
state of the genome. Genes which are expressed can be mapped to their resp. 
positions in the genome, and the structure of such genes can be detd. Thus, 
microarrays enable an efficient and comprehensive genome scan that provides much 
more detailed data than prior art methods. The method allows for the efficient 
identification of small genes, genes that do not encode proteins, genes that are 
transcribed at low levels, and untranslated regions of mRNAs encoding proteins. The 
use of microarrays allows the structure of the gene to be detd. at the same time as the 
gene is detected, even if the gene is spread over larger regions of the genome. Addnl., 
the method allows for the verification of the exon content of a transcript of a particular 
gene through the use of PCR. The method was applied first to known regions of the 
human CXCR4 and Rb genes, then to chromosome 22, in order to identify all exons 
present. Finally, the method was used to survey the entire human genome. 
AU Wilson, Ann S.; Hobbs, Bridget G.; Speed, Terence P.; Rakoczy, P. Elizabeth 

CS Department of Molecular Ophthalmology, Lions Eye Institute, University of Western 
Australia, Nedlands, WA, Australia 
AB A review. The ***microarray*** is a revolutionary technol. combining mol. 
biol. and ***computer*** technol. in the high throughput, simultaneous anal. of 
global gene expression. It is emerging as a powerful and valuable research tool that 
holds great promise in elucidating the mol. mechanisms involved in complex diseases. 
The information gained may provide direction toward identifying appropriate targets for 
therapeutic intervention. Despite the enormous potential of this technol., however, a 
no. of issues exist that complicate gene expression anal. and require further resoln. 
This paper reviews these issues as well as the conceptual, practical and statistical 
aspects of microarray technol., including its current use in research and clin. 
applications. Furthermore, the advantages and potential benefits of this technol. in 
ophthalmic research are discussed, with particular attention to retinal diseases, and its 
possible application in the identification of genes involved in ocular disease progression 
that may serve as clin. markers or potential therapeutic targets. 
AB The last few years have seen the development of DNA microarray technol. that 
allows simultaneous measurement of the expression levels of thousands of genes. 
While many methods have been developed to analyze such data, most have been 
visualization-based. Methods that yield quant. conclusions have been diverse and 
complex. The authors present two straightforward methods for identifying specific 
genes whose expression is linked with a phenotype or outcome variable as well as for 
systematically predicting sample class membership: (1) a conservative, 
permutation-based approach to identifying differentially expressed genes; (2) an augmentation of 
K-nearest-neighbor pattern classification. The analyses replicate the quant. conclusions 
of Golub et al. (Science, 286, 531-537, 1999) on leukemia data, with better 
classification results, using far simpler methods. With the breast tumor data of Perou 
et al. (Nature, 406, 747-752, 2000), the methods lend rigorous quant. support to the 
conclusions of the original paper. In the case of the lymphoma data in Alizadeh et al. 
(Nature, 403, 503-511, 2000), the analyses only partially support the conclusions of the 
original authors. The software and supplementary information are available freely to 
researchers at academic and nonprofit institutions at http://cc.ucsf.edu/jain/public. 
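
Illustrative note (not part of the record): the abstract above mentions a permutation-based approach to identifying differentially expressed genes but gives no details. The following is only a minimal sketch of a generic two-class permutation test on a difference in means, not the authors' procedure. The function name, the 0/1 label encoding, the choice of statistic and the add-one correction are all assumptions made for illustration; both classes are assumed to be non-empty.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in class means.

    values: expression levels of one gene across samples.
    labels: 0/1 class label per sample (hypothetical encoding).
    """
    def mean_diff(lbls):
        # Absolute difference between the two class means under a labeling.
        a = [v for v, l in zip(values, lbls) if l == 1]
        b = [v for v, l in zip(values, lbls) if l == 0]
        return abs(mean(a) - mean(b))

    observed = mean_diff(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffling keeps the class sizes fixed and breaks any real association.
        rng.shuffle(shuffled)
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)
```

A gene whose two classes are well separated should receive a small p-value, while a gene with identical class means receives p = 1 under this statistic.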
AB Background: With the advent of DNA hybridization microarrays comes the 
remarkable ability, in principle, to simultaneously monitor the expression levels of 
thousands of genes. The quant. comparison of two or more microarrays can reveal, for 
example, the distinct patterns of gene expression that define different cellular 
phenotypes or the genes induced in the cellular response to insult or changing 
environmental conditions. Normalization of the measured intensities is a prerequisite of 
such comparisons, and indeed, of any statistical anal., yet insufficient attention has 
been paid to its systematic study. The most straightforward normalization techniques 
in use rest on the implicit assumption of linear response between true expression level 
and output intensity. We find that these assumptions are not generally met, and that 
these simple methods can be improved. Results: We have devised a robust semi-parametric 
normalization technique based on the assumption that the large majority of genes will 
not have their relative expression levels changed from one treatment group to the 
next, and on the assumption that departures of the response from linearity are small 
and slowly varying. We use local regression to est. the normalized expression levels as 
well as the expression level-dependent error variance. Conclusions: We illustrate the 
use of this technique in a comparison of the expression profiles of cultured rat 
mesothelioma cells under control and under treatment with potassium bromate, 
validated using quant. PCR on a selected set of genes. We tested the method using 
data simulated under various error models and find that it performs well. 
AB Background: Microarray technol. is increasingly being applied in biol. and medical 
research to address a wide range of problems, such as the classification of tumors. An 
important statistical problem assocd. with tumor classification is the identification of 
new tumor classes using gene-expression profiles. Two essential aspects of this 
clustering problem are: to est. the no. of clusters, if any, in a dataset; and to allocate 
tumor samples to these clusters, and assess the confidence of cluster assignments for 
individual samples. Here we address the first of these problems. Results: We have 
developed a new prediction-based resampling method, Clest, to est. the no. of clusters 
in a dataset. The performance of the new and existing methods was compared using 
***simulated*** data and gene-expression data from four recently published cancer 
***microarray*** studies. Clest was generally found to be more accurate and robust 
than the six existing methods considered in the study. 
AB Methods, systems and computer software products are provided for analyzing 
gene expression data using pixel intensities. 
AB cDNA microarrays provide simultaneous expression measurements for thousands 
of genes that are the result of processing images to recover the av. signal intensity 
from a spot composed of pixels covering the area upon which the cDNA detector has 
been put down. The accuracy of the signal measurement depends on using an 
appropriate algorithm to process the images. This includes detg. spot locations and 
processing the data in such a way as to take into account spot geometry, background 
noise, and various kinds of noise that degrade the signal. This paper presents a 
stochastic model for microarray images. There are over 20 model parameters, each 
governed by a probability distribution, that control the signal intensity, spot geometry, 
spot drift, background effects, and the many kinds of noise that affect microarray 
images owing to the manner in which they are formed. The model can be used to 
analyze the performance of image algorithms designed to measure the true signal 
intensity because the ground truth (signal intensity) for each spot is known. The levels 
of foreground noise, background noise, and spot distortion can be set, and algorithms 
can be evaluated under varying conditions. 
AB Methods and computer software products are provided for analyzing gene 
expression data. In one embodiment, methods, systems and computer software are 
provided for comparative gene expression anal. using intensity dependent 
normalization factors. 
AB Microarrays and DNA chips are an efficient, high-throughput technol. for 
measuring temporal changes in the expression of messenger RNA (mRNA) from 
thousands of genes (often the entire genome of an organism) in a single expt. A 
crucial drawback of microarray expts. is that results are inherently qual.: data are 
generally neither quant. repeatable, nor may microarray spot intensities be calibrated 
to in vivo mRNA concns. Nevertheless, microarrays represent by far the cheapest 
and fastest way to obtain information about a cell's global genetic regulatory networks. 
Besides poor signal characteristics, the massive no. of data produced by microarray 
expts. poses challenges for visualization, interpretation and model building. Towards 
initial model development, we have developed a Java tool for visualizing the spatial 
organization of gene expression in bacteria. We are also developing an approach to 
inferring and testing qual. fuzzy logic models of gene regulation using microarray data. 
Because we are developing and testing qual. hypotheses that do not require quant. 
precision, our statistical evaluation of exptl. data is limited to checking for validity and 
consistency. Our goals are to maximize the impact of inexpensive microarray technol., 
bearing in mind that biol. models and hypotheses are typically qual. 
AB Chromatin immunopptn. followed by cDNA microarray hybridization (ChIP-array) 
has become a popular procedure for studying genome-wide protein-DNA interactions 
and transcription regulation. However, it can only map the probable protein-DNA 
interaction loci within 1-2 kilobases resoln. To pinpoint interaction sites down to the 
base-pair level, we introduce a computational method, Motif Discovery scan (MDscan), 
that examines the ChIP-array-selected sequences and searches for DNA sequence 
motifs representing the protein-DNA interaction sites. MDscan combines the 
advantages of two widely adopted motif search strategies, word enumeration and 
position-sp. wt. matrix updating, and incorporates the ChIP-array ranking information 
to accelerate searches and enhance their success rates. MDscan correctly identified all 
the exptl. verified motifs from published ChIP-array expts. in yeast (STE12, GAL4, 
RAP1, SCB, MCB, MCM1, SFF, and SWI5), and predicted two motif patterns for the 
differential binding of Rap1 protein in telomere regions. In our studies, the method 
was faster and more accurate than several established motif-finding algorithms. 
MDscan can be used to find DNA motifs not only in ChIP-array expts. but also in other 
expts. in which a subgroup of the sequences can be inferred to contain relatively 
abundant motif sites. The MDscan web server can be accessed at 
http://BioProspector.stanford.edu/MDscan/. 
AB The computer program ScreenBase based on FileMaker 5.5 is described. The 
software was developed to evaluate high-throughput data of the microarray technique. 
AB Arrayplot is an application which allows filtering, visualization and normalization of 
raw cDNA microarray data. 
AB The printed feature size from a quill pin microarraying system was characterized to 
predict optimal microarray d. from common exptl. variables of pin size, soln. viscosity, 
and surface wettability. Features contg. fluorescent dye were printed from two solvent 
systems, glycerol in water and sucrose in water, and obsd. over a wide range of solute 
concns. and substrate wettabilities. Obsd. feature spreading was used to generate 
spreading diagrams that predict printed microarray feature dimensions from the water 
contact angle of the substrate, the size of the printing pin, and the viscosity (or wt % 
solute) of the printing buffer. In general, feature size was obsd. to increase with 
substrate wettability and soln. viscosity. A simple model was developed to predict 
feature d. as a function of the above variables. 
AB A review. The use of high d. DNA microarray is crit. for processing a large no. of 
samples for DNA genome-wide anal. This review paper discussed technol. 
implementation necessary for supporting such high d. DNA microarray systems. 
Development of the accurate microarray scanner with a dynamic auto-focus system for 
homogeneous scanning and the laser irradn. system with automatic laser intensity 
controller were described. Development of a ***computer*** software for 
quantitating ***microarray*** signals was also described regarding the topics on 
spot detn. by gridding, pixel cutting and background correction. 
AB In order to discover global gene expression patterns characterizing subgroups of 
colon cancer, microarrays were hybridized to labeled RNAs obtained from seventeen 
colonic specimens (nine carcinomas and eight normal samples). Using a hierarchical 
agglomerative method, the samples grouped naturally into two major clusters, in 
perfect concordance with pathol. reports (colon cancer vs. normal colon). Using a 
variant of the unpaired t-test, selected genes were ordered according to an index of 
importance. In order to confirm microarray data, we performed quant., real-time 
reverse transcriptase-polymerase chain reaction (TaqMan RT-PCR) on RNAs from 13 
colorectal tumors and 13 normal tissues (seven of which were matched normal-tumor 
pairs). RT-PCR was performed on the gro1, B-factor, adlican, and endothelin 
converting enzyme-1 genes and confirmed microarray findings. Two hundred and fifty 
genes were identified, some of which were previously reported as being involved in 
colon cancer. We conclude that cDNA microarraying, combined with bioinformatics 
tools, can accurately classify colon specimens according to current histopathol. 
taxonomy. Moreover, this technol. holds promise of providing invaluable insight into 
specific gene roles in the development and progression of colon cancer. Our data 
suggests that a large-scale approach may be undertaken with the purpose of 
identifying biomarkers relevant to cancer progression. 
AB A web-based visualization toolkit, called XPAK (expression Anal. Kit), has been 
developed which supports several primitive visualization toolkits including interactive 
comparison of microarray data. XPAK enables the rapid evaluation of exptl. results and 
the construction of web sites for publication. It consists of a database storing 
fundamental descriptions of all genes on the target organism and microarray 
expression data, and a web-based front-end system for data anal. and visualization. 
AB A network-based array analyses data interpreter was developed using the 
GeneSpring com. software package and the Cell Signaling Networks Database 
(CSNDB). The data interpreter has been used to analyze several DNA chips. 
AB The set of tools that comprise GeneSpring as a means to survey current methods 
of anal. is presented. GeneSpring comes with an intuitive interface incorporating 
organized file management, handles data from multiple array formats, includes multiple 
data display formats, a suite of statistical clustering tools, and incorporates automated 
annotation and cross-referencing. The GeneSpring interface is const. among the 
Windows, Macintosh and UNIX platforms on which it is available. 
AB This paper describes a new framework for microarray gene-expression data 
clustering. The foundation of this framework is a min. spanning tree (MST) 
representation of a set of multi-dimensional gene expression data. A key property of 
this representation is that each cluster of the expression data corresponds to one 
subtree of the MST, which rigorously converts a multi-dimensional clustering problem 
to a tree partitioning problem. We have demonstrated that though the inter-data 
relationship is greatly simplified in the MST representation, no essential information is 
lost for the purpose of clustering. Two key advantages in representing a set of multi- 
dimensional data as an MST are: (1) the simple structure of a tree facilitates efficient 
implementations of rigorous clustering algorithms, which otherwise are highly 
computationally challenging; and (2) as an MST-based clustering does not depend on 
detailed geometric shape of a cluster, it can overcome many of the problems faced by 
classical clustering algorithms. Based on the MST representation, we have developed a 
no. of rigorous and efficient clustering algorithms, including two with guaranteed global 
optimality. We have implemented these algorithms as a computer software 
EXCAVATOR. To demonstrate its effectiveness, we have tested it on two data sets, 
i.e., expression data from yeast Saccharomyces cerevisiae, and Arabidopsis expression 
data in response to chitin elicitation. 
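
Illustrative note (not part of the record): the abstract above states that each cluster corresponds to one subtree of the MST, so clustering becomes a tree partitioning problem. The sketch below shows one simple way to do this: build the MST with Prim's algorithm and cut its k-1 longest edges. It is not a reproduction of the EXCAVATOR algorithms, which the abstract does not describe. The function name, the Euclidean distance and the edge-cutting rule are assumptions made for illustration.

```python
import math


def mst_clusters(points, k):
    """Label points by cutting the k-1 longest edges of their Euclidean MST."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []              # MST edges as (weight, u, v)
    for _ in range(n):
        # Prim's algorithm: attach the unvisited vertex closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # Keeping the n-k shortest MST edges is the same as cutting the k-1 longest.
    edges.sort()
    kept = edges[: max(n - k, 0)]
    # Union-find merges the vertices joined by the kept edges into subtrees.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, a, b in kept:
        root[find(a)] = find(b)
    return [find(i) for i in range(n)]
```

Points that end up with the same label belong to the same subtree. Because only MST edge lengths are used, the result does not depend on the geometric shape of a cluster, which is the property the abstract highlights.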

RE.CNT 15 THERE ARE 15 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 280 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:489838 CAPLUS 
DN 138:169511 

TI Efficient evaluation of microarrays 
AU Werenskiold, Anne Katrin 

CS Axxima Pharmaceutical AG, Martinsried, D-82152, Germany 

SO Bioforum International (2002), 6(3), 139-141 CODEN: BINTFQ; ISSN: 1434-2693 

PB GIT Verlag GmbH 

DT Journal 

LA English 

AB The increasing use of microarrays has brought about a massive increase in the 
large vols. of data generated in mol. biol. labs. over the past few years. The software 
package "Screen-Base", based on "File-Maker Pro" as the development system, is 
making headway through the data jungle, developed by the Hamburg company Busch- 
EDV under contract to the Martinsried biotechnol. company Axxima Pharmaceuticals. 
AB One of the challenges to the effective utilization of cDNA microarray anal. in 
mouse models of oncogenesis is the choice of a crit. set of probes that are informative 
for human disease. Given the thousands of genes with a potential role in human 
oncogenesis and the hundreds of thousands of mouse sequences available for use as 
probes, selection of an informative set of mouse probes can be an overwhelming task. 
We have developed a web based sequence mining tool using DataBase Independent 
(DBI) Perl to annotate publicly available sequences. The Mouse Oncochip Design Tool 
uses the Mouse Genome Database (MGD) developed and maintained by the Jackson 
Labs. for mouse DNA sequences. There are over 380 000 sequences in their database. 
The output list has been ordered to present the genes more likely to be informative in 
a mouse model of human cancer using a candidate set of oncogenes to order the list. 
Mouse sequences that represent genes that are homologous with a member of a 
human oncogene set are listed first. In addn. it provides a set of links for information 
on clone source gene function. 
JP, KE, KG, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, 
MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TN, TR, TT, 
TZ, UA, UG, US, UZ, VN, YU, ZA, ZM, ZW RW: GH, GM, KE, LS, MW, MZ, SD, 
SL, SZ, TZ, UG, ZM, ZW, AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, 
MC, NL, PT, SE, TR, BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, 
TD, TG AU 2002021104 A5 20020624 AU 2002-21104 20011210 
PRAI JP 2000-375381 A 20001211 WO 2001-JP10780 W 20011210 
AB A method of detecting a correlation between genes involving the step wherein a 
partial correlation coeff. is approx. detd. with the use of regression anal. in assocn. 
with selection of variables to thereby eliminate effects of an arbitrary third gene on a 
first and second genes among a large no. of genes. Thus, the relation between the 
first and second genes can be found out without being affected by any other gene. This 
method is useful in analyzing the expression profile of genes obtained by DNA 
microarrays. Anal. of correlations between 44 genes whose expression differs between 
cancer and normal cells is described. 
AB The arraying of samples is introduced on application-specific carriers using the 
contact (simultaneous pipetting) or non-contact spotting procedure. The software for 
the controlling of the microspotter system is described. 
AB Microarrays are extensively used in mol. biol. expts. While several vendors offer 
microarrays on a variety of platforms, many researchers prefer to use custom 
microarrays with a selected list of clones for their expts. Many research centers have 
established core facilities for the prodn. of custom microarrays. Microarray prodn. 
involves a no. of steps, including maintaining a master list of stock clones, selecting 
required clones for custom microarrays, subculturing selected clones, amplifying 
inserts, recording results, and identifying the orientation of clones in the microarray. 
The authors have created a simple, user-friendly, and versatile Microsoft Excel 
spreadsheet-based software, Microarray Assistant, which can assist the user in all the 
steps of microarray design and synthesis. In addn., the program gives options to 
insert, delete, or interchange clones during various steps. The program also gives a 
visual picture of the locations of the clones in the plates, as well as in the microarray. 
The program can also be used to assist in the transfer of clones between plates of 
different configuration. 

RE.CNT 2 THERE ARE 2 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 285 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:448982 CAPLUS 
DN 137:227135 

TI Automatic quantitation of hybridization signals on cDNA arrays 

AU Tahi, F.; Achddou, B.; Decraene, C.; Alibert, O.; Guiot, H.; Auffray, C.; Pietu, G. 

CS CNRS FRE 2376, Genexpress, Villejuif, Fr. 

SO BioTechniques (2002), 32(6), 1386-1388, 1390, 1392, 1394, 1396-1397 CODEN: 
BTNQDO; ISSN: 0736-6205 
PB Eaton Publishing Co. 
DT Journal 
LA English 

AB Large-scale hybridization of simple or complex cDNA probes to cDNA clones 
arrayed on high-d. filters is a method frequently used to det. systematically the 
expression profiles of thousands of genes. Hybridization signal intensities, which reflect 
the level of transcription of the corresponding genes, are captured on phosphor screens 
with an imaging system. We describe a high-throughput system, Xdots-Reader, that 
performs automatic detection and quantitation of each signal on hundreds of images. 
Reproducibility of spot detection and quantitation within filters and between filters has 
been assessed in anal. of more than 850 000 hybridization signals on 436 filters. The 
automatic anal. success was greater than 97%, with 424 of the 436 tested filters fully 
analyzed without any human intervention. It runs on SUN workstations under UNIX 
(SunOS or Solaris) and on PC under LINUX. No particular hardware is required, and 
the software is compatible with any other software. It supports the main std. image 
formats. 
AB Here we present a methodol. for the normalization of element signal intensities to 
a mean intensity calcd. locally across the surface of a DNA microarray. These methods 
allow the detection and/or correction of spatially systematic artifacts in microarray 
data. These include artifacts that can be introduced during the robotic printing, 
hybridization, washing, or imaging of microarrays. Using array element signal 
intensities alone, this local mean normalization process can correct for such artifacts 
because they vary across the surface of the array. The local mean normalization can 
be used for quality control and data correction purposes in the anal. of microarray data. 
These algorithms assume that array elements are not spatially ordered with regard to 
sequence or biol. function and require that this spatial mapping is identical between the 
two sets of intensities to be compared. The tool described in this report was developed 
in the R statistical language and is freely available on the Internet as part of a larger 
gene expression anal. package. This Web implementation is interactive and 
user-friendly and allows the easy use of the local mean normalization tool described here, 
without programming expertise or downloading of addnl. software. 
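
Illustrative note (not part of the record): the abstract above describes normalizing each array element to a mean intensity calculated locally across the array surface. The authors' tool was written in R and its details are not given here, so the following Python sketch only illustrates the idea. The function name, the square window clipped at the array edges, the default radius and the choice of division (rather than subtraction) are all assumptions made for illustration.

```python
def local_mean_normalize(grid, radius=1):
    """Divide each element by the mean of its (2*radius+1)-wide neighbourhood.

    grid: list of rows of non-negative spot intensities, laid out as on the array.
    The window is clipped at the array edges, so edge spots use fewer neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            # An all-zero neighbourhood carries no signal; report 0.0 for it.
            row_out.append(grid[r][c] / local_mean if local_mean else 0.0)
        out.append(row_out)
    return out
```

A spatially uniform array maps to all ones under this scheme, while a smooth regional bias (for example, one corner washed less thoroughly) is largely divided out because neighbouring spots share it.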
AB A compact assay system having a solid support has at least one capture binding 
agent on the support surface. By applying a combination of different binding agents on 
the support surface, the present invention can conduct multiple chem. reactions on the 
solid support to detect analytes of interest. The specific reagents, or capture 
binding agents, are preferably immobilized on the solid support by means of a 
computer controlled, miniaturized printing system. Specifically, the reagents can be 
applied onto the solid support using a com. available printhead of an ink-jet printer. In 
addn., the support surface also includes areas adapted to be digitally readable by laser 
to store information concerning binding between capture agents and analytes. The 
assay system is useful as a sample array holder for performing a variety of chem. 
analyses, such as matrix assisted laser desorption ionization mass spectrometry 
analyses. 
AB We report here the identification of a previously unknown transcription regulatory 
element for heat shock (HS) genes in Caenorhabditis elegans. We monitored the 
expression pattern of 11,917 genes from C. elegans to det. the genes that were up- 
regulated on HS. Twenty eight genes were obsd. to be consistently up-regulated in 
several different repetitions of the expts. We analyzed the upstream regions of these 
genes using computational DNA pattern recognition methods. Two potential cis- 
regulatory motifs were identified in this way. One of these motifs (TTCTAGAA) was the 
DNA binding motif for the heat shock factor (HSF), whereas the other (GGGTGTC) was 
previously unreported in the literature. We detd. the significance of these motifs for the 
HS genes using different statistical tests and parameters. Comparative sequence anal. 
of orthologous HS genes from C. elegans and Caenorhabditis briggsae indicated that 
the identified DNA regulatory motifs are conserved across related species. The role of 
the identified DNA sites in regulation of HS genes was tested by in vitro mutagenesis of 
a green fluorescent protein (GFP) reporter transgene driven by the C. elegans hsp-16-2 
promoter. DNA sites corresponding to both motifs are shown to play a significant role 
in up-regulation of the hsp-16-2 gene on HS. This is one of the rare instances in which 
a novel regulatory element, identified using computational methods, is shown to be 
biol. active. The contributions of individual sites toward induction of transcription on 
HS are nonadditive, which indicates interaction and cross-talk between the sites, 
possibly through the transcription factors (TFs) binding to these sites. 
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AB A review. Concepts and results are described for the use of a single, but 
extremely flexible, probing tool to address a wide variety of genomic questions. This is 
achieved by transforming genomic questions into a software file that is used as the 
design scheme for potentially any genomic assay in a microarray format. Microarray 
fabrication takes place in three-dimensional microchannel reaction carriers by in situ 
synthesis based on spatial light modulation. This set-up allows for max. flexibility in 
design and realization of genomic assays. Flexibility is achieved at the mol., genomic 
and assay levels. We have applied this technol. to expression profiling and genotyping 
expts. 
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AB Gene array studies can assess the global expression patterns of thousands of 
genes under multiple conditions. This technol. can provide important insights about 
the underlying genetic causes of many important biol. questions, and can change our 
understanding of diseases, ultimately allowing the development of novel chem. entities 
as potential drug candidates. The informatics anal. and integration of gene expression 
pattern are crit. for interpreting gene array studies. In this paper, we discuss the 
computational anal. of three important tasks: (1) the identification of differentially 
expressed genes, (2) the discovery of gene clusters, and (3) the classification of biol. 
samples. In addn., we discuss how gene sequence and chem. structures can be 
profitably combined with microarray studies. Detailed examples are given throughout. 
Programs written in open source R language for achieving each of these tasks are 
freely available at gila.engr.uic.edu/genex. 
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AB The authors propose a non-contact spotting concept for split-pin based 
microarrays utilizing dynamic control of the trajectory of the split-pin. Numerical 
simulation demonstrates that this novel method not only avoids the necessity of the pin 
tip striking the surface of the substrate, but also offers a new mechanism to realize 
spot-vol.-on-demand, as well as enhance the uniformity of sample spots. 
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AB A system and corresponding method analyzes biol. data for sets of test subjects 
such as gene arrays to group test subjects into clusters and order the clusters into a 
hierarchy based on similarities and differences of biol. data corresponding to the test 
subjects. A combination of nonhierarchical clustering and hierarchical clustering 
methods is used to efficiently and effectively perform hierarchical clustering of such 
biol. data as highly dense gene arrays contg. many thousand test subjects such as 
genes. First the test subjects are nonhierarchically clustered according to similarities 
and differences of their biol. data as detd. by distance techniques. Representative 
values, such as mean values, of the biol. data are detd. for each nonhierarchical cluster 
of test subjects. These representative values are then used to hierarchically cluster the 
nonhierarchical clusters. Biol. data for each test subject is displayed in a row of a 
table. The rows of the table are arranged by the nonhierarchical clustering and further 
by the hierarchical clustering. Each value of the biol. data is color coded according to 
its value to display patterns in the hierarchically clustered biol. data. 
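The two-stage procedure in the abstract above (nonhierarchical clustering, one representative mean per cluster, then hierarchical clustering of those means to order the table rows) can be sketched as follows. This is only an illustration under stated assumptions: k-means with farthest-point initialization stands in for the unspecified nonhierarchical method, average linkage with Euclidean distance stands in for the unspecified hierarchical method, and all function names are hypothetical.

```python
import math

def mean_vector(rows):
    """Component-wise mean of equal-length vectors (the 'representative value')."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans(data, k, iters=20):
    """Assumed nonhierarchical step: plain k-means, deterministic initialization."""
    # Farthest-point initialization: start at the first row, then repeatedly
    # add the row that is farthest from all centroids chosen so far.
    centroids = [list(data[0])]
    while len(centroids) < k:
        far = max(data, key=lambda x: min(math.dist(x, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(data)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(x, centroids[c]))
                  for x in data]
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_vector(members)
    return labels, centroids

def hierarchical_order(centroids):
    """Assumed hierarchical step: average-linkage merging of cluster means.
    Returns cluster ids in one valid dendrogram leaf order."""
    groups = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        total = sum(math.dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > 1:
        pairs = [(i, j) for i in range(len(groups))
                 for j in range(i + 1, len(groups))]
        i, j = min(pairs, key=lambda p: linkage(groups[p[0]], groups[p[1]]))
        merged = groups[i] + groups[j]
        groups = [g for n, g in enumerate(groups) if n not in (i, j)] + [merged]
    return groups[0]

def two_stage_row_order(data, k):
    """Row indices arranged by nonhierarchical cluster, clusters ordered hierarchically."""
    labels, centroids = kmeans(data, k)
    rank = {c: r for r, c in enumerate(hierarchical_order(centroids))}
    return sorted(range(len(data)), key=lambda i: rank[labels[i]])

profiles = [[0, 0], [10, 10], [0.2, 0.1], [5, 5], [10.1, 9.9], [5.1, 4.9]]
print(two_stage_row_order(profiles, k=3))
```

Only the k cluster means enter the quadratic-cost hierarchical step, which is what makes the combination practical for arrays with many thousands of rows.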
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AB The invention concerns the optimization of mol. multiplex diagnostics of tissue 
***microarrays*** using Virtual Cell Nucleus Imaging (VIRNI); ***computer*** 
modeling is used to establish microscopic detectable marker locations in the cells. The 
method can be applied in tumor diagnosis. 
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DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, 
KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, 
NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, 
VN, YU, ZA, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM RW: GH, GM, KE, LS, 
MW, MZ, SD, SL, SZ, TZ, UG, ZW, AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, 
IT, LU, MC, NL, PT, SE, BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, 
TG AU 2000079678 A5 20020429 AU 2000-79678 20001018 
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AB The present invention relates to a digital DNA chip of oligonucleotides probes on a 
solid support for detecting point mutations in a DNA sample. The array is composed of 
a labeling part and a logic part. The labeling part of arrays includes catalog no., gene 
sequence no., ID no., command, and IP address (which indicates the information for 
sample DNA identification to be read by computer). Catalog no. represents the kind of 
chip constructed. Gene sequence no. reports the name of the gene with accession no. 
ID no. identifies the subject or patient being tested. Command directs a short message 
which shows urgent information. The genetic information is stored at the IP address. 
The labeling part may be constructed in the form of a bar code for digital recognition 
by computer. The logic part includes arrays of probes in 4 columns and at least 100 
and up to 100,000 rows. Each column consists of 2 symbols that include a control 
symbol (having detectable marker for digital recognition by computer) and a 
hybridization symbol. The hybridization symbol comprises oligonucleotide probes that 
are 5 to 30 nucleotides in length and that occupy known sites by substituting target 
oligonucleotide into A C G T in each column. A point mutation is present in the 
subject's DNA if both control and hybridization symbol are detected, whereas detection 
of only hybridization symbol indicates normal DNA. Thus, after hybridization the 
genetic information becomes apparent on the chip. Since the information is 
represented by discrete digital information, the information can be read and translated 
using a mech. device into digital information without analog to digital conversion. In 
another embodiment the device may be in the form of a CD-ROM for convenient 
recognition by computer. The information regarding the labeling and logic part may be 
stored by computer or transmitted to other researchers via the internet. 
RE.CNT 3 THERE ARE 3 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 297 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:297584 CAPLUS 
DN 138:35505 
AB Micro total anal. systems (.mu.TAS) have been developed to perform a no. of anal. 
processes involving chem. reactions, sepn. and sensing on a single chip. In medical 
and biomedical applications, .mu.TAS must be designed considering special transport 
mechanisms to move samples and reagents through the microchannels in the system. 
For conventional micro pumps, however, complicated relationships exist between the 
pumping mechanisms, the conditions under which the devices operate and the 
behavior of the multi-component fluids transported in these channels. A bi-directional 
microfluid driving system has been developed in this paper. This pneumatic system is 
an on-chip planar structure with no moving parts and does not require microfabricated 
heaters or electrodes. The pumping actuation is introduced to the microchannel 
fabricated in the chip by blowing an airflow through this device. The bi-directional 
driving module combines two individual components for suction and exclusion. The 
driving system provides a stable and flexible bi-directional microfluid driving control. 
The tunable parameters for adjusting the exclusion/suction ratios, such as the location 
of the inlet channel and the velocities of the airflow, have been obsd. in the numerical 
study. The optimal exclusion/suction ratio for the specific purpose of the driving 
system can be selected by changing the location of the microchannel to the reaction 
area for the sample/reagent. The velocity at the microchannel can be adjusted by 
varying the inlet velocities for the suction and exclusion components. For the 
presented design, no air conduit was employed to connect the servo-system to the 
driving system; therefore the packaging difficulty and leakage problem, which may 
arise in conventional systems, can be eliminated. The final airflow outlet was fixed in 
one direction so that it can prevent cross-contamination between the servo-system and 
the chip. The driving system is therefore particularly suited to microdevices for 
biochem. anal. Diagrams describing the app. assembly and operation are given. 
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AB OligoArray is a program that computes gene specific and secondary structure free 
oligonucleotides for genome-scale oligonucleotide microarray construction or other 
applications. 
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AB Motivation: The existence of several technologies for measuring gene expression 
makes the question of cross-technol. agreement of measurements an important issue. 
Cross-platform utilization of data from different technologies has the potential to 
reduce the need to duplicate expts. but requires corresponding measurements to be 
comparable. Methods: A comparison of mRNA measurements of 2895 sequence- 
matched genes in 56 cell lines from the std. panel of 60 cancer cell lines from the 
National Cancer Institute (NCI 60) was carried out by calcg. correlation between 
matched measurements and calcg. concordance between cluster from two high- 
throughput DNA microarray technologies, Stanford type cDNA microarrays and 
Affymetrix oligonucleotide microarrays. Results: In general, corresponding 
measurements from the two platforms showed poor correlation. Clusters of genes and 
cell lines were discordant between the two technologies, suggesting that relative intra- 
technol. relationships were not preserved. GC-content, sequence length, av. signal 
intensity, and an estimator of cross-hybridization were found to be assocd. with the 
degree of correlation. This suggests gene-specific, or more correctly probe-specific, 
factors influencing measurements differently in the two platforms, implying a poor 
prognosis for a broad utilization of gene expression measurements across platforms. 
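The comparison described in the abstract above rests on correlating matched measurements of the same gene across cell lines on the two platforms. A minimal sketch of that step, assuming Pearson correlation (the abstract does not name the coefficient) and hypothetical input dictionaries keyed by gene:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def per_gene_correlation(platform_a, platform_b):
    """Correlate each sequence-matched gene across the shared cell lines.

    platform_a / platform_b: dict mapping gene -> list of measurements,
    one value per cell line, in the same cell-line order (assumed layout).
    """
    shared = platform_a.keys() & platform_b.keys()
    return {g: pearson(platform_a[g], platform_b[g]) for g in shared}

# Made-up values for two genes measured in three cell lines.
cdna = {"geneA": [1.0, 2.0, 3.0], "geneB": [1.0, 2.0, 3.0]}
oligo = {"geneA": [2.0, 4.0, 6.0], "geneB": [3.0, 2.0, 1.0]}
print(per_gene_correlation(cdna, oligo))
```

A gene whose profile has zero variance on either platform would raise a division error here; a real analysis would need to filter such genes first.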
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AB A computer based system for supporting the diagnosis of current or future 
diseases is disclosed. A database of DNA microarray results for gene expression 
patterns is compared with disease assocn. database to predict diseases likely to occur, 
and DNA microarray database is compared to genetic information database to update 
the disease assocn. database. The genetic information database contains information 
about the patients' family background, clin. results, life styles, and prescription, etc. 
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AB Background: Microarray technologies are emerging as a promising tool for 
genomic studies. The challenge now is how to analyze the resulting large amts. of 
data. Clustering techniques have been widely applied in analyzing microarray gene- 
expression data. However, normal mixt. model-based cluster anal. has not been widely 
used for such data, although it has a solid probabilistic foundation. Here, we introduce 
and illustrate its use in detecting differentially expressed genes. In particular, we do 
not cluster gene-expression patterns but a summary statistic, the t-statistic. Results: 
The method is applied to a data set contg. expression levels of 1176 genes of rats with 
and without pneumococcal middle-ear infection. Three clusters were found, two of 
which contain more than 95% genes with almost no altered gene-expression levels, 
whereas the third one has 30 genes with more or less differential gene-expression 
levels. Conclusions: Our results indicate that model-based clustering of t-statistics (and 
possibly other summary statistics) can be a useful statistical tool to exploit differential 
gene expression for microarray data. 
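The abstract above clusters a per-gene summary statistic rather than the expression patterns themselves. The sketch below computes only that summary statistic, under the assumption that a two-sample Welch t-statistic is acceptable (the abstract does not say which form of t-statistic was used); the normal mixture model fit is not shown, and the data layout is hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Two-sample t-statistic with unequal variances (Welch form, assumed)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

def gene_t_statistics(expression):
    """expression: dict mapping gene -> (values in condition 1, values in condition 2)."""
    return {gene: welch_t(a, b) for gene, (a, b) in expression.items()}

# Made-up expression levels for two genes, three replicates per condition.
example = {
    "unchanged_gene": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    "shifted_gene": ([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]),
}
print(gene_t_statistics(example))
```

The resulting list of t-values, one per gene, is the one-dimensional data set to which a mixture of normal components would then be fitted.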
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AB Background: We propose two different formulations of the Rasch statistical models 
to the problem of relating gene expression profiles to the phenotypes. One formulation 
allows us to investigate whether a cluster of genes with similar expression profiles is 
related to the obsd. phenotypes; this model can also be used for future prediction. The 
other formulation provides an alternative way of identifying genes that are over- or 
underexpressed from their expression levels in tissue or cell samples of a given tissue 
or cell type. Results: We illustrate the methods on available datasets of a classification 
of acute leukemias and of 60 cancer cell lines. For tumor classification, the results are 
comparable to those previously obtained. For the cancer cell lines dataset, we found 
four clusters of genes that are related to drug response for many of the 90 drugs that 
we considered. In addn., for each type of cell line, we identified genes that are over- 
or underexpressed relative to other genes. Conclusions: The cluster-Rasch model 
provides a probabilistic model for describing gene expression patterns across samples 
and can be used to relate gene expression profiles to phenotypes. 
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AB Most microarray scanning software for glass spotted arrays provides ests. for the 
intensity for the "foreground" and "background" of two channels for every spot. The 
common approach in further analyzing such data is to first subtract the background 
from the foreground for each channel and to use the ratio of these two results as the 
est. of the expression level. The resulting ratios are, after possible averaging over 
replicates, the usual inputs for further data anal., such as clustering. If, with this 
background correction procedure, the foreground intensity was smaller than the 
background intensity for a channel, that spot (on that array) yields no usable data. In 
this paper it is argued that this preprocessing leads to ests. of the expression that have 
a much larger variance than needed when the expression levels are low. 
RE.CNT 11 THERE ARE 11 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 
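
The conventional preprocessing that the abstract above criticizes (subtract background from foreground in each channel, then take the ratio of the two results) can be written out in a few lines. This is a sketch of that common approach only, not of the alternative the paper argues for; the function name is hypothetical, and treating a corrected intensity of exactly zero as unusable is an added assumption.

```python
def background_corrected_ratio(fg1, bg1, fg2, bg2):
    """Ratio of background-subtracted intensities for one spot.

    Returns None when either channel's foreground does not exceed its
    background, i.e. the spot yields no usable data under this procedure.
    """
    signal1 = fg1 - bg1
    signal2 = fg2 - bg2
    if signal1 <= 0 or signal2 <= 0:
        return None
    return signal1 / signal2

# Made-up intensities: a well-expressed spot and a spot lost to background.
print(background_corrected_ratio(500.0, 100.0, 300.0, 100.0))
print(background_corrected_ratio(90.0, 100.0, 300.0, 100.0))
```

For weak spots the denominator is a small difference of two noisy numbers, which is the source of the inflated variance at low expression levels that the abstract points out.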

L6 ANSWER 304 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:224142 CAPLUS 
DN 137:226986 
TI Mining mouse microarray data 
AU Wigle, Dennis A.; Rossant, Janet; Jurisica, Igor 
CS Dep. Medical Genetics Microbiology, Univ. Toronto, Toronto, ON, M5S 1A8, Can. 
SO GenomeBiology [online computer file] (2001), 2(7), No pp. given CODEN: 
GNBLFW; ISSN: 1465-6914 
PB BioMed Central Ltd. 
DT Journal; General Review; (online computer file) 
LA English 

AB A review. Microarrays of mouse genes are now available from several sources, 
and they have so far given new insights into gene expression in embryonic 
development, regions of the brain and during apoptosis. Microarray data posted on the 
internet can be reanalyzed to study a range of questions. 
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AB Clin. GeneOrganizer (CGO) is a novel windows-based archiving, organization and 
data mining software for the integration of gene expression profiling in clin. medicine. 
The program implements various user-friendly tools and exts. data for further statistical 
anal. This software was written for Affymetrix GeneChip *.txt files, but can also be 
used for any other microarray-derived data. The MS-SQL server version acts as a data 
mart and links microarray data with clin. parameters of any other existing database and 
therefore represents a valuable tool for combining gene expression anal. and clin. 
disease characteristics. 
AB The DRAGON View information visualization tools aid in the comprehensive anal. 
of large-scale gene expression data that has been annotated with biol. relevant 
information through the generation of three types of complementary graphical outputs. 
AB Motivation: Hierarchical clustering is one of the major anal. tools for gene 
expression data from microarray expts. A major problem in the interpretation of the 
output from these procedures is assessing the reliability of the clustering results. We 
address this issue by developing a mixt. model-based approach for the anal. of 
microarray data. Within this framework, we present novel algorithms for clustering 
genes and samples. One of the byproducts of our method is a probabilistic measure 
for the no. of true clusters in the data. Results: The proposed methods are illustrated 
by application to microarray datasets from two cancer studies; one in which malignant 
melanoma is profiled, and the other in which prostate cancer is profiled. 
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AB Motivation: Existing analyses of microarray data often incorporate an obscure data 
normalization procedure applied prior to data anal. For example, ratios of microarray 
channels intensities are normalized to have common mean over the set of genes. We 
made an attempt to understand the meaning of such procedures from the modeling 
point of view, and to formulate the model assumptions that underlie them. Given a 
considerable diversity of data adjustment procedures, the question of their 
performance, comparison and ranking for various microarray expts. was of interest. 
Results: A two-step statistical procedure is proposed: data transformation (adjustment 
for slide-specific effect) followed by a statistical test applied to transformed data. 
Various methods of anal. for differential expression are compared using simulations and 
real data on colon cancer cell lines. We found that robust categorical adjustments 
outperform the ones based on a precisely defined stochastic model, including some 
commonly used procedures. 
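The normalization mentioned as an example in the abstract above (channel ratios adjusted to a common mean over the set of genes) can be illustrated as follows. This is a sketch under assumptions not stated in the abstract: ratios are taken to the log2 scale and each slide is shifted so that its mean log ratio is zero. It does not reproduce the robust categorical adjustments the authors favor.

```python
import math

def normalize_slide(ratios):
    """Center one slide's log2 ratios so their mean over all genes is zero.

    ratios: list of positive channel-intensity ratios, one per gene.
    Returns the adjusted log2 ratios in the same gene order.
    """
    logs = [math.log2(r) for r in ratios]
    slide_mean = sum(logs) / len(logs)
    return [value - slide_mean for value in logs]

# Made-up data: slide_2 is slide_1 with every ratio doubled, mimicking a
# slide-specific effect such as unequal dye efficiency.
slide_1 = [2.0, 8.0]
slide_2 = [4.0, 16.0]
print(normalize_slide(slide_1))
print(normalize_slide(slide_2))
```

After centering, both slides give the same values, so a purely multiplicative slide effect no longer shows up as differential expression.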
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AB A review describing the practical implementation of a microarray facility using the 
fabrication of Candida albicans microarrays as an example. It is a feasible and cost- 
effective soln. for even small labs. to set up to produce their own high-quality 
microarrays for the genome of any species. The keys to this flexibility are the ability to 
synthesize large nos. of high-quality oligonucleotides in a cost-effective way and to 
have an integrated informatics platform to track samples and follow them through the 
quality control steps. Wider impact of microarray technol. is limited by the cost of the 
currently available microarrays and the relatively few species for which arrays are 
available. 
AB A key assumption in the anal. of microarray data is that the quantified signal 
intensities are linearly related to the expression levels of the corresponding genes. To 
test this assumption, the authors exptl. examd. the relationship between signal and 
expression for the two types of microarrays they most commonly encounter: 
radioactively labeled cDNAs on nylon membranes and fluorescently labeled cDNAs on 
glass slides. Two sources of nonlinearity were recovered. The first, which led to 
discrepancies in anal. affecting the fluorescent signals, was signal quenching assocd. 
with excessive dye concns. The second, affecting the radioactive signals, was a 
nonlinear transformation of the raw data introduced by the scanner. Correction for this 
transformation was made by some, but not all, image-quantification software packages. 
The second type of nonlinearity is more troublesome, because it could not have been 
predicted a priori. Both types of nonlinearities were detected by simple diln. series, 
which the authors recommend as a quality-control step. 
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AB The ability to investigate the transcription of thousands of genes concurrently by 
using DNA microarrays offers both major scientific opportunities and significant anal. 
challenges. Here we describe GABRIEL, a rule-based system of computer programs 
designed to apply domain-specific and procedural knowledge systematically and 
uniformly for the anal. and interpretation of data from DNA microarrays. GABRIEL's 
problem-solving rules direct stereotypical tasks, whereas domain-specific knowledge 
pertains to gene functions and relationships or to exptl. conditions. Addnl., GABRIEL 
can learn novel rules through genetic algorithms, which define patterns that best match 
the data being analyzed and can identify groupings in gene expression profiles 
preordered by chromosomal position or by a nonsupervised algorithm such as 
hierarchical clustering. GABRIEL subsystems explain the logic that underlies 
conclusions and provide a graphical interface and interactive platform for the 
acquisition of new knowledge. The present report compares GABRIEL's output with 
published findings in which expert knowledge has been applied post hoc to microarray 
groupings generated by hierarchical clustering. 
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AB We describe here the development of a carbohydrate-based microarray to extend 
the scope of biomedical research on carbohydrate-mediated mol. recognition and anti- 
infection responses. We have demonstrated that microbial polysaccharides can be 
immobilized on a surface-modified glass slide without chem. conjugation. With this 
procedure, a large repertoire of microbial antigens (.apprx. 20,000 spots) can be 
patterned on a single micro-glass slide, reaching the capacity to include most common 
pathogens. Glycoconjugates of different structural characteristics are shown here to be 
applicable for microarray fabrication, extending the repertoires of diversity and 
complexity of carbohydrate microarrays. The printed microarrays can be air-dried and 
stably stored at room temp. for long periods of time. In addn., the system is highly 
sensitive, allowing simultaneous detection of a broad spectrum of antibody specificities 
with as little as a few microliters of serum specimen. Finally, the potential of 
carbohydrate microarrays is demonstrated by the discovery of previously undescribed 
cellular markers, Dex-Ids. 
AB A user authentication system is provided, in which DNA is used for the user 
authentication in an electronic information exchange or an electronic trade, and 
thereby, the authentication is rapidly performed with high security. In this user 
authentication system for recognizing a normal user, an authentication card possessing 
a DNA microarray is used, on which the hybridization pattern is formed by reacting a 
DNA array with a user's DNA. The hybridization pattern is read from the DNA 
microarray of the authentication card by a scanner, and the information for an 
authentication registration or an authentication is sent to a computer at the contractor 
side. With the computer at the contractor side, the user registration is carried out 
according to the information transmitted, and the user authentication is performed by 
comparing the hybridization pattern shown by the transmitted information with the 
registered hybridization pattern. 
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AB A system and method to predict alternative splicing transcripts using DNA chip 
expression data as a primary data source are disclosed. The system and method may 
perform prediction of alternative splicing of pre-mRNA that may be used, for example, 
for regulating eukaryotic gene expression. 
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AB On the basis of CHG theory, we derive an equation for the radial distribution of the 
deposit concn. and discuss some of its properties. We give a simple criterion for the 
suppression of ring formation, applicable to the data of earlier expts. Second, we 
provide evidence for the convection-dominated deposit formation process by explaining 
a paradox of virtually identical deposits found. 
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TI Genesis: cluster analysis of microarray data 
AB Summary: A versatile, platform independent and easy to use Java suite for large- 
scale gene expression anal. was developed. Genesis integrates various tools for 
microarray data anal. such as filters, normalization and visualization tools, distance 
measures as well as common clustering algorithms including hierarchical clustering, 
self-organizing maps, k-means, principal component anal., and support vector 
machines. The results of the clustering are transparent across all implemented 
methods and enable the anal. of the outcome of different algorithms and parameters. 
Addnl., mapping of gene expression data onto chromosomal sequences was 
implemented to enhance promoter anal. and investigation of transcriptional control 
mechanisms. 
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data 
AB Summary: Cluster Identification Tool (CIT) is a microarray anal. program that 
identifies differentially expressed genes. Following division of exptl. samples based on 
a parameter of interest, CIT uses a statistical discrimination metric and permutation 
anal. to identify clusters of genes or individual genes that best differentiate between 
the exptl. groups. CIT integrates with the freely available CLUSTER and TREEVIEW 
programs to form a more complete microarray anal. package. 
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AB We consider array expts. that compare expression levels of a high no. of genes in 
two cell lines with few repetitions and with no subject effect. We develop a statistical 
model that illustrates under which assumptions thresholding is optimal in the anal. of 
such microarray data. The results of our model explain the success of the empirical 
rule of two-fold change. We illustrate a thresholding procedure that is adaptive to the 
noise level of the expt., the amt. of genes analyzed, and the amt. of genes that truly 
change expression level. This procedure, in a world of perfect knowledge on noise 
distribution, would allow reconstruction of a sparse signal, minimizing the false 
discovery rate. Given the amt. of information actually available, the thresholding rule 
described provides a reasonable estimator for the change in expression of any gene in 
two compared cell lines. 
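The empirical two-fold change rule that the abstract above sets out to explain amounts to a fixed threshold on the log ratio between the two cell lines. A minimal sketch of that fixed rule follows; the adaptive threshold proposed in the abstract depends on noise and sparsity estimates that are not given here and is not implemented. Function and parameter names are hypothetical.

```python
import math

def fold_change_calls(line_a, line_b, fold=2.0):
    """Classify each gene by a symmetric fold-change threshold.

    line_a, line_b: dict mapping gene -> positive expression level in each
    cell line. A gene is called 'up' or 'down' in line_b relative to line_a
    when the absolute log2 ratio reaches log2(fold).
    """
    cutoff = math.log2(fold)
    calls = {}
    for gene, level_a in line_a.items():
        log_ratio = math.log2(line_b[gene] / level_a)
        if log_ratio >= cutoff:
            calls[gene] = "up"
        elif log_ratio <= -cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls

# Made-up expression levels for three genes.
cell_line_a = {"g1": 100.0, "g2": 100.0, "g3": 100.0}
cell_line_b = {"g1": 400.0, "g2": 40.0, "g3": 120.0}
print(fold_change_calls(cell_line_a, cell_line_b))
```

Working on the log scale makes the rule symmetric, so a two-fold increase and a two-fold decrease are treated alike.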
TI Gene chips and computer system for obtaining genetic information necessary for 
prescription of genome medicine 
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AB Gene chips contg. probes for obtaining genetic information necessary for 
prescription of genome medicine, are disclosed. Genetic information regarding the 
concurrent use of other drugs, adverse reactions, and pharmacol. efficacy, is obtained. 
Computer-based system and computer readable memory devices for inputting and 
analyzing hybridization patterns, are also claimed. 
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AB The present invention relates to novel systems, devices, and methods comprising 
spatial light modulators for use in the reading and synthesis of microarrays. For 
example, the present invention provides micromirror systems for synthesizing and 
acquiring data from nucleic acid microarrays and systems for collecting, processing, 
and analyzing data obtained from a microarray. 
AB DNA microarrays are now widely used to measure expression levels and DNA copy 
no. in biol. samples. Ratios of relative abundance of nucleic acids are derived from 
images of regular arrays of spots contg. target genetic material to which fluorescently 
labeled samples are hybridized. Whereas there are a no. of methods in use for the 
quantification of images, many of the software systems in wide use either encourage or 
require extensive human interaction at the level of individual spots on arrays. We 
present a fully automatic system for microarray image quantification. The system 
automatically locates both subarray grids and individual spots, requiring no user 
identification of any image coordinates. Ratios are computed based on explicit 
segmentation of each spot. On a typical image of 6000 spots, the entire process takes 
less than 20 s. We present a quant. assessment of performance on multiple replicates 
of genome-wide array-based comparative genomic hybridization expts. By explicitly 
identifying the pixels in each spot, the system yields more accurate ests. of ratios than 
systems assuming spot circularity. The software, called UCSF Spot, runs on Windows 
platforms and is available free of charge for academic use. 
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AB Voxelation is a new method for acquisition of three dimensional (3D) gene 
expression patterns in the brain. It employs high-throughput anal. of spatially 
registered voxels (cubes) to produce multiple volumetric maps of gene expression 
analogous to the images reconstructed in biomedical imaging systems. Using 
microarrays, 24 voxel images of coronal hemisections at the level of the hippocampus 
of both the normal human brain and Alzheimer's disease brain were acquired for 2000 
genes. The anal. revealed a common network of coregulated genes, and allowed 
identification of putative control regions. In addn., singular value decompn. (SVD), a 
math. method used to provide economical explanations of complex data sets, produced 
images that distinguished between brain structures, including cortex, caudate, and 
hippocampus. The results suggest that voxelation will be a useful approach for 
understanding how the genome constructs the brain. 
AB The comprehensive anal. and visualization of data extd. from cDNA microarrays 
can be a time-consuming and error-prone process that becomes increasingly tedious 
with increased no. of gene elements on a particular microarray. With the increasingly 
large no. of gene elements on today's microarrays, anal. tools must be developed to 
meet this challenge. Here, the authors present MarC-V, a Microsoft Excel spreadsheet 
tool with Visual Basic macros to automate much of the visualization and calcn. involved 
in the anal. process while providing the familiarity and flexibility of Excel. Automated 
features of this tool include (i) lower-bound thresholding, (ii) data normalization, (iii) 
generation of ratio frequency distribution plots, (iv) generation of scatter plots color- 
coded by expression level, (v) ratio scoring based on intensity measurements, (vi) 
filtering of data based on expression level or specific gene interests, and (vii) exporting 
data for subsequent multi-array anal. MarC-V also has an importing function included 
for GenePix results (GPR) raw data files. 
AB A variety of tech. errors have arisen in data anal. when using cDNA or 
oligonucleotide microarrays. One of the most insidious problems is the satn. of the 
hybridization signal of high-abundant transcripts. This problem arises from the 
truncation of the laser fluorescence signal. When the hybridization signal on the 
microarray is very strong, this truncation can result in serious consequences that may 
not be readily apparent to the user. As an illustration of this problem, two subclasses 
of normal human tissue samples (six liver and six lung samples) were analyzed with 
GeneChip probe arrays to evaluate the patterns of expression for approx. 7000 human 
genes. Five of these data sets were found to suffer from signal truncation. This 
caused several tissues to be incorrectly classified using hierarchical clustering. To 
rectify this problem so that the gene expression data could be properly compared and 
clustered, we developed a "filtering" procedure that identifies a subset of genes least 
affected by the signal satn. This filtering procedure can be obtained at 
www.hugeindex.org. 
AB Microarrays can measure the expression of thousands of genes and thus identify
changes in expression between different biol. states. Methods are needed to det. the
significance of these changes, while accounting for the enormous no. of genes. We
describe a new method, Significance Anal. of Microarrays (SAM), that assigns a score to
each gene based on the change in gene expression relative to the std. deviation of 
repeated measurements. For genes with scores greater than an adjustable threshold, 
SAM uses permutations of the repeated measurements to est. the percentage of such 
genes identified by chance, the false discovery rate (FDR). When the transcriptional 
response of human cells to ionizing radiation was measured by microarrays, SAM 
identified 34 genes that changed at least 1.5-fold with an estd. FDR of 12%, compared 
to FDRs of 60% and 84% using conventional methods of anal. Of the 34 genes, 19 
were involved in cell cycle regulation, and 3 in apoptosis. Surprisingly, 4 nucleotide 
excision repair genes were induced, suggesting that this repair pathway for UV- 
damaged DNA might play a heretofore unrecognized role in repairing DNA damaged by 
ionizing radiation. 
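
The scoring and permutation idea described in the abstract above can be sketched as follows. This is a simplified illustration, not the published SAM procedure: the pooled standard error, the small constant `s0`, the use of every possible relabeling, and all function names are assumptions made for the example.

```python
import itertools
import math
import statistics


def relative_difference(values, group_a, s0=0.1):
    """Score one gene: difference in group means over (standard error + s0).

    group_a holds the sample indices of the first condition; each group
    needs at least two samples so the pooled variance is defined.
    """
    a = [values[i] for i in range(len(values)) if i in group_a]
    b = [values[i] for i in range(len(values)) if i not in group_a]
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    squares = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    pooled_variance = squares / (len(a) + len(b) - 2)
    standard_error = math.sqrt(pooled_variance * (1 / len(a) + 1 / len(b)))
    return (mean_a - mean_b) / (standard_error + s0)


def estimate_fdr(genes, group_a, threshold, s0=0.1):
    """Return (number of genes called, estimated false discovery rate)."""
    group_a = set(group_a)
    n_samples = len(genes[0])
    called = sum(abs(relative_difference(g, group_a, s0)) > threshold for g in genes)
    if called == 0:
        return 0, 0.0
    # Every way of relabeling the samples, including the observed labeling.
    relabelings = [set(c) for c in itertools.combinations(range(n_samples), len(group_a))]
    chance_calls = [
        sum(abs(relative_difference(g, labels, s0)) > threshold for g in genes)
        for labels in relabelings
    ]
    expected_by_chance = statistics.mean(chance_calls)
    return called, min(1.0, expected_by_chance / called)


# Hypothetical data: one responsive gene and one flat gene, three samples per state.
genes = [[10.0, 10.1, 9.9, 1.0, 1.1, 0.9], [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]]
called, fdr = estimate_fdr(genes, group_a={0, 1, 2}, threshold=5.0)
```

Enumerating all relabelings is feasible only for small sample counts; a random subset of permutations would be the usual substitute for larger designs.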
AB A review with refs. Microarrays are part of a new class of biotechnologies that
allow the monitoring of expression levels for thousands of genes simultaneously.
Image anal. is an important aspect of microarray expts., one that can have a
potentially large impact on subsequent analyses, such as clustering or the identification
of differentially expressed genes. This paper reviews a no. of existing image anal.
methods used on cDNA microarray data. In particular, it describes and discusses the 
different segmentation and background adjustment methods. It was found that in 
some cases background adjustment can substantially reduce the precision - i.e., 
increase the variability of low-intensity spot values. In contrast, the choice of 
segmentation procedure seems to have a smaller impact. 
AB A review on application of DNA microarrays in simultaneous investigation of
diverse disease-assocd. gene expressions. Proteome anal., 2-dimensional gel
electrophoresis of tear protein patterns of diabetic patients and patients with Sicca 
syndrome, and data evaluation by bioinformatics using neuronal networks are 
described. 

RE.CNT 35 THERE ARE 35 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 328 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:51674 CAPLUS 
DN 136:97275 

TI Manipulation of hybridizations of gene expression ***microarray*** by
***computer***-implemented method

IN Gupta, Robert P.; Choi, Kirindi V. M.; Brahms, Robert A.; Chang, Doris; Chong, 

Darryl V. K.; Burrill, John D.; Marcus, Gregory; Bartha, Gabor T.; Head, Richard

PA Incyte Genomics, Inc., USA 

SO PCT Int. Appl., 30 pp. CODEN: PIXXD2 

DT Patent 

LA English 

FAN.CNT 1
     PATENT NO.        KIND  DATE      APPLICATION NO.   DATE
PI   WO 2002004676     A2    20020117  WO 2001-US21382   20010705
         W:  AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CR, CU, CZ, DE,
             DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG,
             KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO,
             NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN,
             YU, ZA, ZW
         RW: GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW, AT, BE,
             CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR, BF, BJ,
             CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG
     US 6424921        B1    20020723  US 2000-613167    20000710
     US 2002188409     A1    20021212  US 2002-199239    20020719
PRAI US 2000-613167    A     20000710

AB The invention relates to a computer-implemented method of averaging a plurality of
hybridizations of gene expression microarrays. Composite hybridization arrays and
averaged hybridization arrays are provided. Composite hybridization arrays are formed
from a user-selected set of hybridization arrays and, once instantiated in a hybridization
array database, are available for searching, anal., and other data processing as with
other types of hybridization arrays. This allows otherwise expts. using multiple
different nucleotide microarrays to be efficiently consolidated and analyzed. Averaged
hybridization arrays provide correctly averaged values from multiple user-selected dual
channel hybridization arrays.
AB A microarrayer for spotting soln. onto slides (4A-4E) in an automated microarray
dispensing device. Elements of the present invention include: at least one dispense
head (6) for spotting the slides, at least one light source (13) capable of illuminating
the slides, and at least one camera (12) operating in conjunction with the at least one light
source. The at least one camera is capable of acquiring and transmitting slide image
data to a computer. The computer (300) is programmed to receive the slide image
data and analyze it. The computer will then generate post anal. data based on the
anal. of the slide image data. The post anal. data is available for improving the
spotting of the soln. onto the slides. In a preferred embodiment, the slide image data
includes identification information. In a preferred embodiment, the anal. of the
information relating to slide alignment enables the computer to make automatic
adjustments to the relative positions of the at least one dispense head and the slides to
increase the accuracy of the spotting. In a preferred embodiment, the anal. of the
information relating to spot quality identifies a spot as pass or fail. An operator is then
able to rework the spot. In a preferred embodiment, the anal. of the slide
identification information enables the computer to track each slide. 
AB A template-map design strategy for generating sets of non-interacting DNA
oligonucleotides for applications in DNA arrays and biosensors is demonstrated. This
strategy is used to create a set of oligonucleotides of size s with length l that possess at
least n base mismatches with the complements of all the other members in the set.
These "DNA word" sets are denoted as nbm l-mers or l:n sets. To regularize the
thermodn. stability of the perfectly matched hybridized DNA duplexes, the l-mers
chosen for all the sets are required to have an approx. 50% G/C content. To achieve
good discrimination between each DNA word in each set generated using the template-
map strategy, it is required that n should be approx. equal to l/2 or higher. The
template-map strategy can be used in a straightforward manner to create DNA word
sets for cases when l = 4k and n = 2k, where k is an integer. Specific examples of
4k:2k sets are designed: an 8:4 set (s = 224), a 12:6 set (s = 528), a 16:8 set (s =
960), and a 20:10 set (s = 1520). These sets are further optimized to achieve the
narrowest possible distribution of melting temps. by selecting the best set after
permutation of the templates and maps over all possible configurations. To
demonstrate the viability of this methodol., a non-interacting set of four specific 6bm
12mers have been chosen, synthesized, and used in an SPR imaging measurement of
the hybridization adsorption onto a DNA array. The template-map strategy is also
applied to generate DNA word sets for cases where l .noteq. 4k. In these cases, the
creation of the maps and templates is more complicated, but possible. The templates
and maps for three addnl. types of sets are created: (4k - 1):(2k - 1), (4k + 1):2k, and
(4k - 2):(2k - 1). Specific examples are given for l = 7, 9, and 10: DNA word sets of
7:3 (s = 224), 9:4 (s = 360), and 10:5 (s = 132). 
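
The mismatch criterion stated in the abstract above can be checked with a short sketch. It verifies a candidate set only; it does not reproduce the template-map construction. Comparing each word in register against the reverse complement of every other word, and all function names and example words, are assumptions made for the illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word written 5' to 3'."""
    return word.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_noninteracting(words, n):
    """True if every word has at least n mismatches with the complement of every other word."""
    for i, word in enumerate(words):
        for j, other in enumerate(words):
            if i != j and mismatches(word, reverse_complement(other)) < n:
                return False
    return True


def gc_fraction(word):
    """Fraction of G and C bases, for the approx. 50% G/C requirement."""
    return sum(base in "GC" for base in word) / len(word)


# Hypothetical 4-mers, used only to exercise the check.
print(is_noninteracting(["AACC", "ACAC"], n=2))
print(is_noninteracting(["AACC", "GGTT"], n=2))
```

The second call fails the check because GGTT is the exact complement of AACC, which is the kind of pair an l:n set is meant to exclude.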

RE.CNT 32 THERE ARE 32 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 331 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2002:10825 CAPLUS 



DN 136:50683 

TI Method and computer programs for processing gene expression data obtained 
from DNA chip hybridization 
IN Konishi, Tomokazu 
PA Japan 

SO PCT Int. Appl., 48 pp. CODEN: PIXXD2 
DT Patent 
LA Japanese 

FAN.CNT 1
     PATENT NO.        KIND  DATE      APPLICATION NO.   DATE
PI   WO 2002001477     A1    20020103  WO 2001-JP4697    20010604
         W:  AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
             CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS,
             JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
             MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA,
             UG, US, UZ, VN, YU, ZA, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM
         RW: GH,
AB A method and computer programs for processing gene expression data obtained
from DNA chip hybridization are disclosed. It involves the steps of logarithmic
conversion, detn. of median value, z-normalization, for computing a background value.
Chi square anal. is also used. A background computing unit of an anal. device
computes such a background value such that a normal probability graph based on a 
cumulative frequency ratio of a subtracted value obtained by subtracting a background 
value from the individual values indicating the signal intensities of spots arranged on a 
DNA chip may have a predetd. linearity. A logarithmic conversion value of a cor. signal 
intensity value, as prepd. by subtracting the background value from the values 
indicating the signal intensities, takes the form of a normal distribution. By 
standardizing the normal distribution, therefore, it is possible to compare the data 
which are measured from the DNA chips of the same kind or of different kinds. 
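
The processing chain named in the abstract above (background subtraction, logarithmic conversion, median, z-normalization) might look like the following minimal sketch. The background value is taken as given rather than searched for via the normal probability graph, and the use of the population standard deviation as the scale is an assumption; the function name and data are hypothetical.

```python
import math
import statistics


def z_normalize(intensities, background):
    """Subtract the background, take log10, then center on the median and scale.

    Spots at or below the background are dropped because their logarithm
    is undefined. The scale estimate (population standard deviation) is an
    assumption of this sketch.
    """
    logs = [math.log10(x - background) for x in intensities if x > background]
    center = statistics.median(logs)
    spread = statistics.pstdev(logs)
    if spread == 0.0:
        raise ValueError("all corrected intensities are identical")
    return [(value - center) / spread for value in logs]


# Hypothetical spot intensities from one chip, with an assumed background of 100.
z_scores = z_normalize([110.0, 200.0, 1100.0, 10100.0, 100100.0], background=100.0)
```

Once two chips are each standardized this way, their z-scores are on a common scale, which is the comparison the abstract describes.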
TI Selective MOVPE of microarray waveguide for densely integrated photonic devices

AU Sudo, Shinya; Kudo, Koji; Mori, Kazuo; Sasaki, Tatsuya 

CS Photonic and Wireless Devices Research Laboratories, System Devices and 

Fundamental Research, NEC Corporation, Otsu, 520-0833, Japan 

SO Conference Proceedings - International Conference on Indium Phosphide and 

Related Materials, 13th, Nara, Japan, May 14-18, 2001 (2001), 390-393 Publisher: 

Institute of Electrical and Electronics Engineers, New York, N. Y. CODEN: 69CCW7 

DT Conference 
AB The authors studied the mask interference effect in selective OMVPE of a
microarray optical waveguide. The authors studied the characteristics of waveguides 
having mask interference effects. Based on the exptl. results, the authors developed a 
method of ***simulating*** the characteristics of a ***microarray*** waveguide 
that uses the mask interference const., which depends on growth pressure. The 
simulation can account for the exptl. results under different growth pressures and it 
should be very useful for designing the microarray waveguides. In particular, the 
authors can use it to control the PL-wavelength profile of the microarray waveguide
grown under atm. pressure, which is important for fabricating densely integrated 
photonic devices. 
TI A design method of DNA chips for SNP analysis using self organizing maps

AU Douzono, Hiroshi; Hara, Shigeomi; Noguchi, Yoshio 

CS Faculty of Science and Engineering Saga University, Saga, 840-8502, Japan 

SO International Joint Conference on Neural Networks, Proceedings, Washington, DC, 

United States, July 15-19, 2001 (2001), Volume 4, 2467-2471 Publisher: Institute of 

Electrical and Electronics Engineers, New York, N. Y. CODEN: 69CDDP 

DT Conference 
AB In this paper, we introduce a design method of DNA chips using Self-Organizing
Maps (SOM). DNA chips are powerful tools for sequencing and SNP (Single Nucleotide
Polymorphism) analyses of DNA sequences. A DNA chip is an array of DNA probes
which are hybridized with the complementary sub-sequences in the target sequence.
However, conventional DNA chips show a tendency to be comprised of longer
probes and to get larger in size to achieve a higher resoln. To shrink the size of DNA
chips, the design is considered to be important. To solve this problem, we applied SOM
to obtain common features of DNA sequences with a small no. of probes which efficiently
cover the target sequence with sufficient resoln. for finding the correct position of 
SNPs. We evaluated the DNA chips designed by SOM with computer simulations of 
SNP analyses changing the length of probes and size of the maps. 
TI Assessing gene significance from cDNA microarray expression data via mixed
models

AU Wolfinger, Russell D.; Gibson, Greg; Wolfinger, Elizabeth D.; Bennett, Lee;
Hamadeh, Hisham; Bushel, Pierre; Afshari, Cynthia; Paules, Richard S. 
CS SAS Institute Inc., Cary, NC, 27513, USA 

SO Journal of Computational Biology (2001), 8(6), 625-637 CODEN: JCOBEM; ISSN: 
PB Mary Ann Liebert, Inc.
DT Journal 
LA English 

AB The detn. of a list of differentially expressed genes is a basic objective in many 
cDNA microarray expts. We present a statistical approach that allows direct control 
over the percentage of false positives in such a list and, under certain reasonable 
assumptions, improves on existing methods with respect to the percentage of false 
negatives. The method accommodates a wide variety of exptl. designs and can 
simultaneously assess significant differences between multiple types of biol. samples. 
Two interconnected mixed linear models are central to the method and provide a 
flexible means to properly account for variability both across and within genes. The 
mixed model also provides a convenient framework for evaluating the statistical power 
of any particular exptl. design and thus enables a researcher to a priori select an 
appropriate no. of replicates. We also suggest some basic graphics for visualizing lists 
of significant genes. Analyses of published expts. studying human cancer and yeast 
cells illustrate the results. 
TI DNA microarray assay for genetic polymorphisms using scattered light detectable
labels 

IN Bee, Gary; Kohne, David E.; Korb, Linda; Peterson, Todd; Yguerabide, Juan 
DT Patent
PI   WO 2001096604     A2    20011220  WO 2001-US18912   20010611
     WO 2001096604     A3    20030717
         W:  AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR,
             BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE,
             GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT,
             LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG,
             SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW
         RW: GH, GM,
             KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM,
             AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR, BF,
             BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG
     AU 2001075475     A5    20011224  AU 2001-75475     20010611
     US 2002127561     A1    20020912  US 2001-880732    20010612

PRAI US 2000-210988P   P     20000612
     WO 2001-US18912   W     20010611
AB The invention provides a method for detg. the presence or absence of particular 
polymorphisms in CYP2D6 and other genes using scattered light detectable particles as 
detectable labels. The method utilizes a detection method based on the use of certain 
particles of specific compn., size, and shape and the detection and/or measurement of 
one or more of the particle's light scattering properties. The target sequences in a
sample are bound to detectable light scattering particles, for example RLS (resonance
light scattering) particles, which are then illuminated with a light beam, and the illumination
can be detected by the human eye with less than 500 times magnification. Preferred
RLS particles are composed of colloidal metals, preferably gold, silver, mixed gold and
silver, or other mixed compn. particles contg. gold and/or silver. The invention
provides convenient and sensitive detection for genetic polymorphisms, such as
deletion, insertion, and single nucleotide polymorphisms.

L6 ANSWER 336 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2001:904563 CAPLUS 
DN 136:17715 

TI Method and system for predicting nucleic acid hybridization thermodynamics and 

computer-readable storage medium for use therein 

IN Santalucia, John, Jr.; Peyret, Nicolas 

PA Wayne State University, USA 

SO PCT Int. Appl., 100 pp. CODEN: PIXXD2 

DT Patent 

LA English 
PI   WO 2001094611     A2    20011213  WO 2001-US18424   20010607
     WO 2001094611     A3    20020418
         W:  AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR,
             BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD,
             GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS,
             LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE,
             SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW
         RW: GH,
             GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW, AT, BE, CH, CY, DE, DK, ES, FI,
             FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR, BF, BJ, CF, CG, CI, CM, GA, GN, GW,
             ML, MR, NE, SN, TD, TG
     AU 2001075349     A5    20011217  AU 2001-75349     20010607
     EP 1311837        A2    20030521  EP 2001-942053    20010607
         R:  AT, BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE, MC, PT, IE, SI, LT, LV, FI,
             RO, MK, CY, AL, TR
     US 2003224357     A1    20031204  US 2001-876549    20010607

PRAI US 2000-209778P   P     20000607
     WO 2001-US18424   W     20010607
AB Method and system to predict and optimize probe-target hybridization are 
provided. The method may be implemented using six interactive, interrelated, software 
modules. Module 1 predicts the hybridization thermodn. of a duplex given the two 
strands. Module 2 finds the best primer of a given length binding to a given target. 
Module 3 executes a primer walk to find alternative binding sites of a given primer on a 
given target. Module 5 is a combination of Modules 2 and 3. Module 6 finds the 
alternative binding sites of a given primer on a given target (Module 3) and calcs. the 
concn. of target with primer bound at primary and alternative sites. Module 7 is a 
combination of Modules 2 and 5 and also calcs. the various concns. The six modules 
can be operated either through an interactive user interface or using batch file 
submission as provided by Module 4. The program is suited to predict DNA/DNA, 
RNA/RNA, and RNA/DNA systems. 
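
The primer walk attributed to Module 3 in the abstract above can be pictured with a minimal sketch that slides a primer's binding site along a target and reports candidate positions. A plain mismatch count stands in here for the thermodynamic scoring used by the described software, so this illustrates the walk only; the function names, the mismatch cutoff, and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def primer_walk(primer, target, max_mismatches=1):
    """List (start, mismatches) for each target window the primer could bind.

    The primer pairs with the window equal to its reverse complement, so a
    perfect site scores 0. Mismatch counting is a stand-in for a
    thermodynamic score in this sketch.
    """
    site = reverse_complement(primer)
    hits = []
    for start in range(len(target) - len(site) + 1):
        window = target[start:start + len(site)]
        count = sum(a != b for a, b in zip(site, window))
        if count <= max_mismatches:
            hits.append((start, count))
    return hits


# Hypothetical primer and target strand.
print(primer_walk("AAGC", "TTGCTTAAGCATGG"))
```

Sites with a nonzero mismatch count correspond to the alternative binding sites mentioned in the abstract; ranking them properly would require the duplex stability prediction of Module 1.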
TI Gene expression profile analysis by DNA microarrays. Promise and pitfalls
AU King, Hadley C.; Sinha, Animesh A.

CS Dep. Dermatology, Weill Med. College, Cornell Univ., New York, NY, USA 

SO JAMA, the Journal of the American Medical Association (2001), 286(18), 2280- 

2288 CODEN: JAMAAP; ISSN: 0098-7484 

PB American Medical Association 

DT Journal; General Review 

LA English 

AB A review with refs. DNA ***microarrays*** represent a technol. intersection
between biol. and ***computers*** that enables gene expression anal. in human
tissues on a genome-wide scale. This application can be expected to prove extremely 
valuable for the study of the genetic basis of complex diseases. Despite the enormous 
promise of this revolutionary technol., there are several issues and possible pitfalls that 
may undermine the authority of the microarray platform. We discuss some of the 
conceptual, practical, statistical, and logistical issues surrounding the use of 
microarrays for gene expression profiling. These issues include the imprecise definition 
of normal in expression comparisons; the cellular and subcellular heterogeneity of the 
tissues being studied; the difficulty in establishing the statistically valid comparability of 
arrays; the logistical logjam in anal., presentation, and archiving of the vast quantities 
of data generated; and the need for confirmational studies that address the functional 
relevance of findings. Although several complicated issues must be resolved, the 
potential payoff remains large. 
Tobin, Katherine P.; Kashuk, Carl; Mathews, Debra J.; Shah, Nila A.; Eichler, Evan E.;
Warrington, Janet A.; Chakravarti, Aravinda 

CS McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School 
of Medicine, Baltimore, MD, 21287, USA 

SO Genome Research (2001), 11(11), 1913-1925 CODEN: GEREFS; ISSN: 1088-9051 
DT Journal
LA English 

AB The genetic dissection of complex traits may ultimately require a large no. of SNPs 
to be genotyped in multiple individuals who exhibit phenotypic variation in a trait of 
interest. Microarray technol. can enable rapid genotyping of variation specific to study 
samples. To facilitate their use, we have developed an automated statistical method 
(ABACUS) to analyze microarray hybridization data and applied this method to 
Affymetrix Variation Detection Arrays (VDAs). ABACUS provides a quality score to 
individual genotypes, allowing investigators to focus their attention on sites that give 
accurate information. We have applied ABACUS to an expt. encompassing 32 
autosomal and eight X-linked genomic regions, each consisting of ~50 kb of unique
sequence spanning a 100-kb region, in 40 humans. At sufficiently high-quality scores,
we are able to read ~80% of all sites. To assess the accuracy of SNP detection, 108 of
108 SNPs have been exptl. confirmed; an addnl. 371 SNPs have been confirmed
electronically. To assess the accuracy of diploid genotypes at segregating autosomal
sites, we confirmed 1515 of 1515 homozygous calls, and 420 of 423 (99.29%) 
heterozygotes. In replicate expts., consisting of independent amplification of identical 
samples followed by hybridization to distinct microarrays of the same design, 
genotyping is highly repeatable. In an autosomal replicate expt., 813,295 of 813,295 
genotypes are called identically (including 351 heterozygotes); at an X-linked locus in 
males (haploid), 841,236 of 841,236 sites are called identically. 
RE.CNT 41 THERE ARE 41 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 339 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2001:823845 CAPLUS 
DN 137:1068 

TI A statistical method for flagging weak spots improves normalization and ratio 
estimates in microarrays 

AU Yang, M. C. K.; Ruan, Q. G.; Yang, J. J.; Eckenrode, S.; Wu, S.; McIndoe, R. A.;
She, J. X. 

CS Department of Statistics, University of Florida, Gainesville, FL, 32610-0275, USA 

LA English

AB Over the last few years, there has been a dramatic increase in the use of cDNA 
microarrays to monitor gene expression changes in biol. systems. Data from these 
expts. are usually transformed into expression ratios between exptl. samples and a
common ref. sample for subsequent data anal. The accuracy of this crit. transformation 
depends on two major parameters: the signal intensities and the normalization of the 
expt. vs. ref. signal intensities. A new model for microarray signal intensity that has 
one multiplicative variation and one additive background variation was described and 
validated. Using replicative expts. and simulated data, we found that the signal 
intensity is the most crit. parameter that influences the performance of normalization, 
accuracy of ratio ests., reproducibility, specificity, and sensitivity of microarray expts. 
Therefore, we developed a statistical procedure to flag spots with weak signal intensity 
based on the std. deviation (.delta.ij) of background differences between a spot and 
the neighboring spots, i.e., a spot is considered as too weak if the signal is weaker than 
c.delta.ij. Our studies suggest that normalization and ratio ests. were unacceptable 
when this threshold (c) is small. We further showed that when a reasonable 
compromise of c (c = 6) is applied, normalization using trimmed mean of log ratios 
performed slightly better than global intensity and mean of ratios. These studies 
suggest that decreasing the background noise is crit. to improve the quality of
microarray expts. 
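
The flagging rule and the trimmed-mean normalization described in the abstract above can be sketched as follows. The std. deviation of background differences for each spot is taken as an input rather than computed from neighboring spots, and the trim fraction, function names, and data are assumptions made for the illustration; only the threshold c = 6 comes from the abstract.

```python
import math
import statistics


def flag_weak_spots(signals, background_sds, c=6.0):
    """Flag a spot as too weak when its signal is below c times its background SD."""
    return [signal < c * sd for signal, sd in zip(signals, background_sds)]


def trimmed_mean(values, trim_fraction=0.1):
    """Mean after discarding the given fraction of values from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(kept)


def normalize_log_ratios(sample, reference, background_sds, c=6.0, trim_fraction=0.1):
    """Return {spot index: centered log2 ratio} for spots strong in both channels."""
    weak_sample = flag_weak_spots(sample, background_sds, c)
    weak_reference = flag_weak_spots(reference, background_sds, c)
    log_ratios = {
        i: math.log2(s / r)
        for i, (s, r) in enumerate(zip(sample, reference))
        if not (weak_sample[i] or weak_reference[i])
    }
    center = trimmed_mean(list(log_ratios.values()), trim_fraction)
    return {i: value - center for i, value in log_ratios.items()}


# Hypothetical intensities for four spots; the last spot is weak in the sample channel.
ratios = normalize_log_ratios(
    sample=[200.0, 400.0, 800.0, 20.0],
    reference=[100.0, 200.0, 400.0, 100.0],
    background_sds=[10.0, 10.0, 10.0, 10.0],
)
```

Flagging before normalization keeps weak spots from influencing the centering value, which is the dependence on signal intensity that the abstract emphasizes.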
AB Microarrays can measure the expression of thousands of genes and thus identify
changes in expression between different biol. states. Methods are needed to det. the
significance of these changes, while accounting for the enormous no. of genes. We
describe a new method, Significance Anal. of Microarrays (SAM), that assigns a score to
each gene based on the change in gene expression relative to the std. deviation of
repeated measurements. For genes with scores greater than an adjustable threshold,
SAM uses permutations of the repeated measurements to est. the percentage of such
genes identified by chance, the false discovery rate (FDR). When the transcriptional
response of human cells to ionizing radiation was measured by microarrays, SAM
identified 34 genes that changed at least 1.5-fold with an estd. FDR of 12%, compared to
FDRs of 60% and 84% using conventional methods of anal. Of the 34 genes, 19 were
involved in cell cycle regulation, and 3 in apoptosis. Surprisingly, 4 nucleotide excision 
repair genes were induced, suggesting that this repair pathway for UV-damaged DNA 
might play a heretofore unrecognized role in repairing DNA damaged by ionizing 
radiation. 
TI Construction of DNA microarray expression database and its data mining strategy
AB A review. How to establish LIMS (Lab. Information Management Systems) for
DNA array technologies and how to utilize the information in the database for data
mining were discussed. Image anal. software such as Scanalyze, ImaGene,
QuantArray and ArrayAnalyzer and algorithms within for reading microarray results
were described with a typical flow of the processing of DNA microarray results. The
data-mining software such as Cluster, TreeView, Spotfire, Arrayscout and Genespring
and how to ext. data from the microarray database and how to process the data were
described. The procedures and tools for gene information annotation that would make
cDNA database more valuable in data mining were also discussed with some actual
examples of annotation systems such as FlyBase, NCBI and RIKEN definition.
amplification or deletion using comparative genomic hybridization



IN Seelig, Steven A. 
PA Vysis, Inc., USA 

SO PCT Int. Appl., 61 pp. CODEN: PIXXD2 
DT Patent 
LA English 

FAN.CNT 1
     PATENT NO.        KIND  DATE      APPLICATION NO.   DATE
PI   WO 2001075160     A1    20011011  WO 2001-US10063   20010329
         W:  AU, CA, CN, JP, KR
         RW: AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC,
             NL, PT, SE, TR
     CA 2402320        AA    20011011  CA 2001-2402320   20010329
     EP 1268860        A1    20030102  EP 2001-964688    20010329
         R:  AT, BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE, MC, PT, IE, FI, CY, TR
PRAI US 2000-539400    A     20000331
     WO 2001-US10063   W     20010329
AB The method of the invention comprises the classification of a cancer patient
population into various cancer therapy groups based on anal. by genomic DNA
microarray of multiple gene amplifications or deletions present or absent in the
diseased tissue of each patient. In particular, the invention involves patient
classification into one of at least four cancer therapy groups based on the microarray
anal. of gene amplification or gene deletion at multiple chromosome locations. The
invention has the significant clin. advantage of guiding selection of expensive cancer
adjuvant drugs for use with patients most likely to respond pos. to the individual drug.
For example, a genomic DNA microarray simultaneously measuring 59 sep. gene
amplifications or gene deletions in diseased tissue can be used to stratify solid tumor
cancer patients, such as breast cancer patients, into at least nine groups: those most
likely to respond to (i) anti-HER-2/neu therapy (Herceptin), (ii) anti-EGFR therapy
(C225 antibody), (iii) anti-AKT1 therapy (cis-platin), (iv) anti-PIK3CA therapy, (v) anti-
thymidylate synthase therapy (5-fluorouracil), (vi) anti-Topoisomerase II therapy
(doxorubicin), (vii) anti-cmyc therapy, (viii) combination of anti-HER-2 therapy and
anti-AKT1 therapy, and (ix) combination of anti-EGFR and anti-AKT1 therapy. The
invention has the significant clin. advantage of guiding selection of expensive cancer
adjuvant drugs for use with patients most likely to respond pos. to the individual drug
or respond synergistically to a particular combination of adjuvant therapies. The
invention has yet another advantage, compared to use of nucleic acid microarrays
measuring only gene expression changes in the diseased tissue from normal tissue, of
measuring changes in a more stable analyte, chromosomal DNA, than the labile mRNA
necessary for gene expression anal.
AB A review of .sigma.54-dependent genes in Escherichia coli. .sigma.54 has several
features that distinguish it from other sigma factors in Escherichia coli: it is not 
homologous to other .sigma. subunits, .sigma.54-dependent expression absolutely 
requires an activator, and the activator binding sites can be far from the transcription 
start site. A rationale for these properties has not been readily apparent, in part 
because of an inability to assign a common physiol. function for .sigma.54-dependent
genes. Surveys of .sigma.54-dependent genes from a variety of organisms suggest 
that the products of these genes are often involved in nitrogen assimilation; however, 
many are not. Such broad surveys inevitably remove the .sigma.54-dependent genes 
from a potentially coherent metabolic context. To address this concern, we consider 
the function and metabolic context of .sigma.54-dependent genes primarily from a
single organism, Escherichia coli, in which a reasonably complete list of .sigma.54-
dependent genes has been identified by ***computer*** anal. combined with a
DNA ***microarray*** anal. of nitrogen limitation-induced genes. E. coli appears to
have approx. 30 .sigma.54-dependent operons, and about half are involved in nitrogen
assimilation and metab. A possible physiol. relationship between .sigma.54-dependent
genes may be based on the fact that nitrogen assimilation consumes energy and 
intermediates of central metab. The products of the .sigma.54-dependent genes that 
are not involved in nitrogen metab. may prevent depletion of metabolites and energy 
resources in certain environments or partially neutralize adverse conditions. Such a 
relationship may limit the no. of physiol. themes of .sigma. 54-dependent genes within 
a single organism and may partially account for the unique features of .sigma.54 and 
.sigma.54-dependent gene expression. 
AB A review with refs. DNA ***microarray*** technol., also called "genechip" 
technol., incorporates mol. genetics and ***computer*** science on a massive 
scale. This technol. can rapidly provide a detailed view of the simultaneous expression 
of entire genomes and provide new insights into gene function, disease pathophysiol., 
disease classification, and drug development. In this review, the author discusses the 
basic theory behind genechip and other biol. chip technologies, their limitations given 
the current state of biol. knowledge and computational abilities, and their potential 
applications to the understanding of neural disorders. 
AB A data mining package, called GeneSight, was developed to analyze quant. 
microarray data. It provides a no. of query tools and features for the desktop user to 
address both numerical and text-based needs. The package contains two specific tools 
designed to query a data set and assocd. textual information: the Template Matcher, 
for numerical queries into a data set, and the Query/Group Builder tool, for numerical 
and text-based queries into a data set as well as an underlying database contg. 
information about the genes present on a chip. In addn., this program contains several 
general features that aid the investigator in rapidly identifying specific genes based on 
their unique identifier within a data set. The use of these tools and the package's 
specific features for exploring a modest-sized data set are described. 
AB Labs. use different laser-based scanners to scan microarray images. To assess 
whether results from different scanners are comparable, and thus whether data from 
different labs. can be compared, we scanned the same microarray slide with three com. 
scanners that use different imaging techniques. After the acquisition of the microarray 
images produced by the three scanners, the images were quantified using a single 
imaging software package and protocol. The results were compared, and we found 
that the data obtained from the three scanners were comparable and that the 
variations caused by the use of different instruments were negligible, in spite of the 
fact that the scanners were based on different optical imaging techniques. 
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AB A method for the prepn. of small mol. microarrays using positionally encoded 
libraries and its application to functional proteomic profiling in a model system are 
presented. Libraries of small mols. tethered to peptidonucleic acid (PNA) tags were 
constructed. The feasibility of this method was demonstrated using mechanism-based 
cysteine protease inhibitors contg. an acrylamide functionality. The PNA tag 
did not significantly affect the activity or selectivity of the inhibitor. A series of compds. 
designed to inhibit cathepsin S, L, H, B, C, and calpain were synthesized. Results 
showed that the proposed size exclusion sepn. is effective to sep. the bound 
PNA-ligand conjugates from the unbound ones, and that PNA is efficient for positional 
encoding. Small mol.-PNA conjugates could be used to probe protein function in a 
microarray format. The ability to array small mol. libraries prepd. by split-pool 
combinatorial synthesis in a spatially addressable format allows for multiplexed 
screening in a highly miniaturized format to generate profiles of cellular activity. 
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AB This paper presents a simulator for gene expression networks, based on the model 
of chain dynamical systems (CDS). It gives the definition of CDS, describes the 
simulator architecture, the language adopted for describing CDS, and the available 
outputs. Finally, a real genetic network is studied: a subsystem of the genetic network 
that controls the cell cycle of adrenocortical cells of the Y1 cultured cell line. 
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AB This paper presents a parallel program for assessing the codetn. of gene 
transcriptional states from large-scale simultaneous gene expression measurements 
with cDNA microarrays. The parallel program is based on a nonlinear statistical 
framework recently proposed for the anal. of gene interaction via multivariate 
expression arrays. Parallel computing is key in the application of the statistical 
framework to a large set of genes because a prohibitive amt. of computer time is 
required on a classical single-CPU machine. Our parallel program, named the Parallel 
Anal. of Gene Expression (PAGE) program, exploits inherent parallelism exhibited in the 
proposed codetn. prediction models. By running PAGE on 64 processors in Beowulf, a 
clustered parallel system, an anal. of melanoma cDNA ***microarray*** expression 
data has been completed within 12 days of ***computer*** time, an anal. that 
would have required about one and a half years on a single-CPU computing system. A 
data visualization program, named the Visualization of Gene Expression (VOGE) 
program, has been developed to help interpret the massive amt. of quant. information 
produced by PAGE. VOGE provides graphical data visualization and anal. tools with 
filters, histograms, and access to other genetic databanks for further analyses of the 
quant. information. 
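
The timing figures quoted in the abstract above (about one and a half years on a single CPU versus 12 days on 64 processors) imply a speedup and a parallel efficiency that can be worked out directly. The sketch below is illustrative only: the 365.25-day year and the function name are assumptions, and the abstract's figures are approximate.

```python
# Rough speedup/efficiency arithmetic from the figures quoted in the abstract.
# Assumption: "one and a half years" is taken as 1.5 * 365.25 days.

def parallel_speedup(serial_days, parallel_days, processors):
    """Return (speedup, efficiency) for a parallel run versus a serial run."""
    speedup = serial_days / parallel_days
    efficiency = speedup / processors
    return speedup, efficiency

serial_days = 1.5 * 365.25          # estimated single-CPU time, in days
speedup, efficiency = parallel_speedup(serial_days, 12, 64)
print(f"speedup ~{speedup:.1f}x, efficiency ~{efficiency:.0%}")
```

Under these assumptions the run is roughly 46 times faster than the serial estimate, i.e. about 70% of the ideal 64-fold speedup.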
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AB The images resulting from cDNA microarrays are highly random. There are many 
aspects to this randomness, including spot size, shape, intensity, uniformity, and 
circularity, as well as both foreground and background noise. This paper presents a 
random model for the generation of microarray images. The model is complicated and 
contains over 20 parameters. It can be used to test ***microarray*** imaging 
algorithms and to ***simulate*** the effects of various dependencies within the 
image formation process. 
AB Anal. of microarray data often involves extg. information from raw intensities of 
spots of cells and making certain calls. Rank-based algorithms are powerful tools to 
provide probability values of hypothesis tests, esp. when the distribution of the 
intensities is unknown. For our current gene expression arrays, a gene is detected by a 
set of probe pairs consisting of perfect match and mismatch cells. The one-sided 
upper-tail Wilcoxon's signed rank test is used in our algorithms for abs. calls (whether a 
gene is detected or not), as well as comparative calls (whether a gene is increasing or 
decreasing or no significant change in a sample compared with another sample). We 
also test the possibility to use only perfect match cells to make calls. This paper 
focuses on abs. calls. We have developed error anal. methods and software tools that 
allow us to compare the accuracy of the calls in the presence or absence of mismatch 
cells at different target concns. The usage of nonparametric rank-based tests is not 
limited to abs. and comparative calls of gene expression chips. They can also be 
applied to other oligonucleotide microarrays for genotyping and mutation detection, as 
well as spotted arrays. 
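
The abs. call described in the abstract above can be illustrated with a minimal sketch of a one-sided, upper-tail Wilcoxon signed rank test on perfect match (PM) versus mismatch (MM) intensities. This is not the authors' software: the function names, the exact enumeration of sign patterns (practical only for a small no. of probe pairs), and the 0.05 cutoff are all assumptions made for illustration.

```python
from itertools import product

def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_upper_p(pm, mm):
    """Exact upper-tail p-value of the signed rank statistic for PM - MM."""
    diffs = [p - m for p, m in zip(pm, mm) if p != m]  # drop zero differences
    if not diffs:
        return 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_obs = sum(r for r, d in zip(ranks, diffs) if d > 0)
    hits = total = 0
    # Under the null hypothesis every sign pattern is equally likely.
    for signs in product((0, 1), repeat=len(ranks)):
        total += 1
        if sum(r for r, s in zip(ranks, signs) if s) >= w_obs:
            hits += 1
    return hits / total

def is_detected(pm, mm, alpha=0.05):
    """Hypothetical abs. call: 'detected' when the p-value is below alpha."""
    return signed_rank_upper_p(pm, mm) < alpha
```

For example, `is_detected([120, 150, 200, 180, 170, 90], [100, 110, 140, 100, 160, 95])` evaluates the six probe pairs as one set; because the test uses only ranks of the differences, no assumption about the intensity distribution is needed.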
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AB To more accurately measure fluorescent signals from microarrays, we calibrated 
our acquisition and anal. systems by using groundtruth samples comprised of known 
quantities of red and green gene-specific DNA probes hybridized to cDNA targets. We 
imaged the slides with a full-field, white light CCD imager and analyzed them with our 
custom anal. software. Here we compare, for multiple genes, results obtained with and 
without preprocessing (alignment, color crosstalk compensation, dark field subtraction, 
and integration time). We also evaluate the accuracy of various image processing and 
anal. techniques (background subtraction, segmentation, quantitation and 
normalization). This methodol. calibrates and validates our system for accurate quant. 
measurement of microarrays. Specifically, we show that preprocessing the images 
produces results substantially closer to the known groundtruth for these samples. 
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AB The present invention provides a microarray substrate comprising a plurality of 
photodetectors integrated therein. The invention further provides a detection device 
for use in conjunction with a microarray substrate of the invention, as well as methods 
of use of same. 
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AB One of the limitations of classical sequencing by hybridization (SBH) is the 
inefficient use of probes in the "all k-mers" array. This limitation occurs due to the 
relatively short length (roughly C) of target that may be reconstructed by an array with 
C probes. We propose a new strategy, multiplex sequencing by hybridization, that 
greatly increases the efficiency of target reconstruction. In the typical multiplex SBH 
method, many different target sequences are simultaneously reconstructed (as 
compared to a single sequence in classic SBH). This is accomplished by pooling the 
target sequences and performing several hybridization expts. This procedure makes 
more efficient use of probes so that the combined length of sequence reconstructed 
per DNA array increases significantly as compared to classical SBH. 
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AB We consider the problem of inferring fold changes in gene expression from cDNA 
microarray data. Std. procedures focus on the ratio of measured fluorescent intensities 
at each spot on the microarray, but to do so is to ignore the fact that the variation of 
such ratios is not const. Ests. of gene expression changes are derived within a simple 
hierarchical model that accounts for measurement error and fluctuations in abs. gene 
expression levels. Significant gene expression changes are identified by deriving the 
posterior odds of change within a similar model. The methods are tested via 
***simulation*** and are applied to a panel of Escherichia coli ***microarrays***. 
AB Fundamental investigations of protein crystn. using microarrayed multiple cell 
Si-devices were proposed for achieving heterogeneous nucleation/crystn. and also for 
screening expts. Surface-potential (.zeta.-potential) controlled nucleation and crystn. 
sites made from deposited thin-film semiconductor and insulating materials were 
fabricated on the surface of each crystal growth cell. .zeta.-Potential measurement 
using the electrophoretic light scattering spectrophotometric method showed that both 
ionic strength and pH values had a great influence on the potential of solid material 
surfaces, such as n/p-Si, SiO2, Si3N4, and Al2O3, and also protein ones. Isoelec. point 
of protein was influenced and shifted with the ionic strength, but point of zero charge 
of our solid material surface was still unchanged. Then the conventional microarrayed 
configuration was adopted in our device, but each unit well was composed of single 
reservoir and paired multiple crystal growth cells to prep. the protein droplets of 
different buffer and precipitant concns. The no. of our multiple growth cells in a unit 
well was at least 2, and the available vol. for protein drop ranged from 1 to 10 .mu.l. 
This cell configuration and sample prepn. was expected to cover the whole effective pH 
and concn. regions for heterogeneous nucleation and crystn. and accordingly 
accelerate the screening expts. without changing conventional reagents and protocols. 
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AB The altered gene expression of K562 cell after treatment with arsenic sulfide was 
detected by cDNA microarray. Two fluorescence cDNA probes were made from mRNA 
of arsenic sulfide untreated or treated K562 cells, marked with two different fluorescent 
dyes, cy3 or cy5 resp., hybridized with an expressed cDNA ***microarray***, scanned 
and analyzed by a ***computer*** system to det. the altered expression of 
the genes. Eleven genes were identified, related to cell cycle, DNA transcription and 
transcription factors, and protein translation, which were expressed differently after 
treatment with arsenic sulfide: 7/11 elevated, 4/11 depressed. It is suggested that 
cyclin E2 and cyclin G2 may take part in the process of K562 cell apoptosis induced by 
arsenic sulfide. 
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AB In order to understand the structure of DNAs and their interactions when on 
***microarray*** surfaces, we performed the first all-atom mol. dynamics 
***simulation*** of DNA tethered to a surface. On the surface, the binding of the 
DNA was enhanced, and its av. equil. conformation was the B form. The DNA duplex 
spontaneously tilted towards its nearest neighbor and settled in a leaning position with 
an interaxial distance of 2.2 nm. This close packing of the DNAs, which affects both in 
situ synthesis and deposition of probes on microarray surfaces, can thus be explained 
by salt-induced colloid-like DNA-DNA attractions. 
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AB AMADA is a Windows program for identifying co-expressed genes from microarray 
data. It performs data transformation, principal component anal., a variety of cluster 
analyses and extensive graphic functions for visualizing expression profiles. 
AB DNA microarrays are now capable of providing genome-wide patterns of gene 
expression across many different conditions. The first level of anal. of these patterns 
requires detg. whether obsd. differences in expression are significant or not. Current 
methods are unsatisfactory due to the lack of a systematic framework that can 
accommodate noise, variability, and low replication often typical of microarray data. A 
Bayesian probabilistic framework for microarray data anal. was developed. At the 
simplest level, we model log-expression values by independent normal distributions, 
parameterized by corresponding means and variances with hierarchical prior 
distributions. We derive point ests. for both parameters and hyperparameters, and 
regularized expressions for the variance of each gene by combining the empirical 
variance with a local background variance assocd. with neighboring genes. An addnl. 
hyperparameter, inversely related to the no. of empirical observations, dets. the 
strength of the background variance. Simulations show that these point ests., 
combined with a t-test, provide a systematic inference approach that compares 
favorably with simple t-test or fold methods, and partly compensate for the lack of 
replication. 
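
The idea of a regularized variance combined with a t-test, as described in the abstract above, can be sketched as follows. The abstract does not give the formula, so the weighting used here (a degrees-of-freedom weighted average of the empirical and background variances) is an assumption, as are the function names and the fixed default for the weight `nu0`; in the abstract that hyperparameter is tied to the no. of observations.

```python
from math import sqrt
from statistics import mean, variance

def regularized_variance(sample_var, n, background_var, nu0):
    """Blend a gene's empirical variance with a background variance.

    Assumed form: weight nu0 on the background, (n - 1) on the data.
    With nu0 = 0 this reduces to the ordinary sample variance.
    """
    return (nu0 * background_var + (n - 1) * sample_var) / (nu0 + n - 1)

def regularized_t(control, treated, bg_var_control, bg_var_treated, nu0=10):
    """t-like statistic on log-expression values using regularized variances."""
    v1 = regularized_variance(variance(control), len(control), bg_var_control, nu0)
    v2 = regularized_variance(variance(treated), len(treated), bg_var_treated, nu0)
    return (mean(treated) - mean(control)) / sqrt(v1 / len(control) + v2 / len(treated))
```

With few replicates the empirical variance is unstable, so a larger `nu0` pulls the estimate toward the background value; with many replicates the data dominate.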
AB A survey and comparative anal. of microarray databases was undertaken in order 
to obtain a better understanding of the available systems. The survey included 
databases that are currently available, as well as databases that should become 
available in early 2001. Databases fall into three categories: those that can be installed 
locally, those available for public data submission, and those available for public query. 
Developers of microarray gene expression databases were asked questions regarding 
the scope and availability of their database, its system requirements, its future 
compliance with MGED (Microarray Gene Expression Database) stds., and its assocd. 
anal. tools. Each database fulfils a different role, reflecting the widely varying needs of 
microarray users. 
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AB A review. The authors introduce a methodol. for inducing predictive rule models 
for functional classification of gene expressions from microarray hybridization expts. 
The basic learning method is the rough set framework for rule induction. The 
methodol. is different from the commonly used unsupervised clustering approaches in 
that it exploits background knowledge of gene function in a supervised manner. Genes 
are annotated using Ashburner's Gene Ontol. and the functional classes used for 
learning are mined from these annotations. From the original expression data, we ext. 
a set of biol. meaningful features that are used for learning. A rule model is induced 
from the data described in terms of these features. Its predictive quality is fine-tuned 
via cross-validation on subsets of the known genes prior to classification of unknown 
genes. The predictive and descriptive quality of such a rule model is demonstrated on 
the fibroblast serum response data previously analyzed by V. R. Iyer et al. (1999). This 
anal. shows that the rules are capable of representing the complex relationship 
between gene expressions and function, and that it is possible to put forward high 
quality hypotheses about the function of unknown genes. 
AB cDNA microarray technol. is useful for systematically analyzing the expression 
profiles of thousands of genes at once. Although many useful results inferred by using 
this technol. and a hierarchical clustering method for statistical anal. have been 
confirmed using other methods, there are still questions about the reproducibility of the 
data. We have therefore developed a data processing method that very efficiently 
exts. reproducible data from the result of duplicate expts. It is designed to 
automatically filter the raw results obtained from cDNA microarray image-anal. 
software. We optimize the threshold value for filtering the data by using the product of 
N and R, where N is the ratio of the no. of spots that passed the filtering vs. the total 
no. of spots, and R is the correlation coeff. for results obtained in the duplicate expts. 
Using this method to process mouse tissue expression profile data that contain 
1,881,600 points of anal., we obtained clustered results more reasonable than those 
obtained using previously reported filtering methods. 
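
The threshold selection described in the abstract above (maximizing the product of N and R) can be sketched as below. The abstract does not say what quantity is thresholded, so this sketch assumes a spot passes when its intensity in both duplicate expts. is at least the threshold; the function names and candidate thresholds are likewise hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coeff.; returns None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

def best_threshold(rep1, rep2, candidates):
    """Return (threshold, N*R) maximizing the product N*R over candidates.

    N = fraction of spots passing the filter; R = correlation coeff. of the
    passing spots between the duplicate expts.
    """
    best = None
    for thr in candidates:
        kept = [(a, b) for a, b in zip(rep1, rep2) if a >= thr and b >= thr]
        r = pearson([a for a, _ in kept], [b for _, b in kept])
        if r is None:
            continue  # too few spots (or no spread) to define R
        score = (len(kept) / len(rep1)) * r
        if best is None or score > best[1]:
            best = (thr, score)
    return best
```

The product penalizes both extremes: a lax filter keeps many spots (high N) but lets noisy ones lower R, while a strict filter raises R at the cost of a small N.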
AB Cluster anal. of genome-wide expression data from DNA microarray hybridization 
studies has proved to be a useful tool for identifying biol. relevant groupings of genes 
and samples. In the present paper, we focus on several important issues related to 
clustering algorithms that have not yet been fully studied. We describe a simple and 
robust algorithm for the clustering of temporal gene expression profiles that is based 
on the simulated annealing procedure. In general, this algorithm guarantees to 
eventually find the globally optimal distribution of genes over clusters. We introduce an 
iterative scheme that serves to evaluate quant. the optimal no. of clusters for each 
specific data set. The scheme is based on std. approaches used in regular statistical 
tests. The basic idea is to organize the search of the optimal no. of clusters 
simultaneously with the optimization of the distribution of genes over clusters. The 
efficiency of the proposed algorithm has been evaluated by means of a reverse 
engineering expt., i.e., a situation in which the correct distribution of genes over 
clusters is known a priori. The employment of this statistically rigorous test has shown 
that our algorithm places more than 90% of genes into correct clusters. Finally, the 
algorithm has been tested on real gene expression data (expression changes during 
yeast cell cycle) for which the fundamental patterns of gene expression and the 
assignment of genes to clusters are well understood from numerous previous studies. 
The source code of the program implementing the algorithm is available upon request 
from the authors. 
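
A minimal sketch of clustering expression profiles by simulated annealing, in the spirit of the abstract above, is given below. It is not the authors' program: the cost function (within-cluster sum of squared distances to the centroid), the single-gene reassignment move, the geometric cooling schedule, and all parameter values are assumptions, and the no. of clusters k is fixed rather than estd. A finite run like this one carries no guarantee of reaching the global optimum.

```python
import math
import random

def within_cluster_cost(profiles, labels, k):
    """Sum of squared distances of each profile to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [p for p, l in zip(profiles, labels) if l == c]
        if not members:
            continue
        centroid = [sum(col) / len(members) for col in zip(*members)]
        cost += sum((x - m) ** 2 for p in members for x, m in zip(p, centroid))
    return cost

def anneal_clusters(profiles, k, steps=2000, t_start=1.0, cooling=0.995, seed=0):
    """Return (labels, cost) of the best assignment seen during annealing."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in profiles]
    cost = within_cluster_cost(profiles, labels, k)
    best_labels, best_cost = labels[:], cost
    temp = t_start
    for _ in range(steps):
        i = rng.randrange(len(profiles))
        old = labels[i]
        labels[i] = rng.randrange(k)          # propose moving one gene
        new_cost = within_cluster_cost(profiles, labels, k)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann prob.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old                   # reject: restore previous label
        temp *= cooling                       # geometric cooling
    return best_labels, best_cost
```

Calling `anneal_clusters(profiles, 2)` on a list of equal-length numeric profiles returns one cluster label per profile together with the cost of that assignment.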
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AB A microarray scanning system for conducting expts. on a planar substrate includes 
an app. for translating the secured substrate in two axes, the substrate having at least 
one fiducial mark on the planar substrate as a means for positioning and aligning the 
substrate for subsequent spot placement, anal., or comparison procedures. 
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TI A system for finding association rules from microarray data and public databases 
AU Naitou, Takahiro; Satou, Kenji; Furuichi, Emiko; Kuhara, Satoru; Takagi, Toshihisa 
CS School of Knowledge Science, Japan Advanced Institute of Science and 
Technology, Ishikawa, 923-1292, Japan 

SO Genome Informatics Series (2000), 11(Genome Informatics 2000), 356-357 
CODEN: GINSE9; ISSN: 0919-9454 
PB Universal Academy Press 
DT Journal 
LA English 

AB As the research trend of genome anal. changes from sequencing to gene 
expression and genetic network identification, the microarray has attracted a great deal 
of attention. However, the methodol. for extg. knowledge from a set of microarray 
data is not yet established. Consequently, the clustering of genes according to their 
expressivity is still the most popular way for the first anal. to be tried. Since the data 
obtained from a microarray expt. consists solely of pairs of gene names and their 
expressivity, it is necessary to combine this data with other information extd. from public 
databases in order to find useful knowledge. To solve this problem, a Web-based anal. 
system, which can discover assocn. rules from a combination of microarray data and 
public databases, is being developed. The functionalities of the prototype system 
currently under development are described. 
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TI Practical organization and functional annotation of RIKEN cDNA microarray 
AU Bono, Hidemasa; Kasukawa, Takeya; Miki, Rika; Kadota, Koji; Okazaki, Yasushi; 
Hayashizaki, Yoshihide 
CS Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences 
Center (GSC), RIKEN Yokohama Institute, Kanagawa, 230-0045, Japan 
SO Genome Informatics Series (2000), 11(Genome Informatics 2000), 260-261 
CODEN: GINSE9; ISSN: 0919-9454 
PB Universal Academy Press 
DT Journal 
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AB An important issue in using DNA microarray technol. for analyzing gene 
expressions is the ability to predict gene functions from the similarity and dissimilarity 
of expression patterns. With this ability, the functional inference of genes that cannot 
be inferred from any sequence analyses can be readily obtained. However, this 
requires effective computational methods and resources. In this regard, a web-based 
system, called READ (Riken Expression Array Database), for analyzing cDNA microarray 
data from the RIKEN mouse 19K set is under development. The READ system 
organizes all the information that needs to be referred to in the anal. phase, including the 
results of various sequence analyses. 
TI Data submission system for cyanobacterial DNA chip consortium 
AU Uchiyama, Ikuo; Miwa, Tomoki; Nishide, Hiroyo; Suzuki, Iwane; Omata, Tatsuo; 
Ikeuchi, Masahiko; Murata, Norio; Kanehisa, Minoru 
CS National Institute for Basic Biology, Research Center for Computational Science, 
Okazaki National Research Institutes, Okazaki, 444-8585, Japan 
SO Genome Informatics Series (2000), 11(Genome Informatics 2000), 235-236 
CODEN: GINSE9; ISSN: 0919-9454 
PB Universal Academy Press 
DT Journal 
LA English 

AB A DNA microarray is an extremely useful tool for functional genomics by surveying 
genome-wide gene expression changes in cells under different conditions. A 
cyanobacterial DNA chip consortium established under the Genome Frontier Project, 
consists of Japanese cyanobacterial researchers with a wide range of interests. In each 
member's lab., expts. are underway using the same chip on which segments of almost 
all ORFs identified in Synechocystis sp. PCC6803 genome are spotted, but using various 
materials subjected to changes in different environmental conditions such as temp., 
light intensity or CO2 concn. The data submission system for this consortium is hereby 
presented. The system accepts and manages data about exptl. conditions and relates 
them to the submitted expression data file produced from the image anal. software. 
TI KEGG/EXPRESSION: a database for browsing and analyzing microarray expression 
data 
AU Goto, Susumu; Kawashima, Shuichi; Okuji, Yoshinori; Kamiya, Tomomi; Miyazaki, 
Satoshi; Numata, Youjiro; Kanehisa, Minoru 
CS Institute for Chemical Research, Kyoto University, Kyoto, 611-0011, Japan 
SO Genome Informatics Series (2000), 11(Genome Informatics 2000), 222-223 
CODEN: GINSE9; ISSN: 0919-9454 
PB Universal Academy Press 
DT Journal 
AB A prototype of the KEGG/EXPRESSION database was developed for storing publicly 
available yeast microarray data, and the stored data were used to analyze 
aminoacyl-tRNA synthase behavior. This database was integrated into the 
DBGET/LinkDB system, and new Java applets for browsing and analyzing the data were 
developed. Each entry of KEGG/EXPRESSION in the DBGET/LinkDB system 
corresponds to an array and contains the information on entry name, accession 
identification, brief description of the expt., conditions of the control and target arrays, 
experimenters, data and organism. This entry is in the HTML format and is a starting 
point to the browsing methods. The expression data from each expt. can be browsed 
as either a microarray image or a scatter plot. Three types of plots are available, a plot 
for whole genes, classified plots according to the KEGG functional classification, and 
classified plots according to the functional classification by the sequencing group of the 
organism. This database also links to various anal. tools from the result of the browser 
and the list of genes. 
TI ***Microarray*** analysis of genes differentially expressed in HepG2 cells 
cultured in ***simulated*** microgravity: Preliminary report 
AU Khaoustov, Vladimir I.; Risin, Diana; Pellis, Neal R.; Yoffe, Boris 
CS Department of Medicine, Veterans Affairs Medical Center, Baylor College of 
Medicine, Houston, TX, 77030, USA 
SO In Vitro Cellular & Developmental Biology: Animal (2001), 37(2), 84-88 CODEN: 
IVCAED; ISSN: 1071-2690 
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DT Journal; General Review 
LA English 

AB A review with 18 refs. including the authors' own works. Developed at NASA, the 
rotary cell culture system (RCCS) allows the creation of a unique microgravity 
environment of low shear force and high mass transfer, and enables 3-dimensional (3D) 
cell culture of dissimilar cell types. Recently the authors demonstrated that simulated 
microgravity is conducive to maintaining long-term cultures of functional hepatocytes 
and promotes 3D cell assembly. Using DNA microarray technol., it is now possible to 
measure the levels of thousands of different messenger ribonucleic acids (mRNAs) in a 
single hybridization step. This technique is particularly powerful for comparing gene 
expression in the same tissue under different environmental conditions. The aim of 
this research was to analyze gene expression of hepatoblastoma cell line (HepG2) 
during early stage of 3D-cell assembly in simulated microgravity. For this, mRNA from 
HepG2 cultured in the RCCS was analyzed by DNA microarray. Analyses of HepG2 
mRNA by 6K glass DNA microarray revealed changes in expression of 95 genes 
(overexpression of 85 genes and downregulation of 10 genes). The preliminary results 
indicated that ***simulated*** microgravity modifies the expression of several 
genes and that ***microarray*** technol. may provide new understanding of the 
fundamental biol. questions of how gravity affects the development and function of 
individual cells. 
AU Sherlock, Gavin; Hernandez-Boussard, Tina; Kasarskis, Andrew; Binkley, Gail; 
Matese, John C.; Dwight, Selina S.; Kaloper, Miroslava; Weng, Shuai; Jin, Heng; Ball, 
Catherine A.; Eisen, Michael B.; Spellman, Paul T.; Brown, Patrick O.; Botstein, David; 
Cherry, J. Michael 

CS Department of Genetics, Center for Clinical Sciences Research, Stanford University, 
Stanford, CA, 94305-5163, USA 

SO Nucleic Acids Research (2001), 29(1), 152-155 CODEN: NARHAD; ISSN: 0305- 
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DT Journal 
LA English 

AB The Stanford Microarray Database (SMD) stores raw and normalized data from 
microarray expts., and provides web interfaces for researchers to retrieve, analyze and 
visualize their data. The two immediate goals for SMD are to serve as a storage site 
for microarray data from ongoing research at Stanford University, and to facilitate the 
public dissemination of that data once published, or released by the researcher. Of 
paramount importance is the connection of microarray data with the biol. data that 
pertains to the DNA deposited on the microarray (genes, clones etc.). SMD makes use 
of many public resources to connect expression information to the relevant biol., 
including SGD [Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Issel-Tarver,L., 
Kasarskis,A., Scafe,C.R., Sherlock,G., Binkley,G., Jin,H. et al. (2000) Nucleic Acids Res., 
28, 77-80], YPD and WormPD [Costanzo,M.C., Hogan,J.D., Cusick,M.E., Davis,B.P., 
Fancher,A.M., Hodges,P.E., Kondu,P., Lengieza,C., Lew-Smith,J.E., Lingner,C. et al. 
(2000) Nucleic Acids Res., 28, 73-76], Unigene [Wheeler,D.L., Chappey,C., Lash,A.E., 
Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Nucleic 
Acids Res., 28, 10-14], dbEST [Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) 
Nature Genet., 4, 332-333] and SWISS-PROT [Bairoch,A. and Apweiler,R. (2000) 
Nucleic Acids Res., 28, 45-48] and can be accessed at 
http://genome-www.stanford.edu/microarray. 

RE.CNT 12 THERE ARE 12 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 372 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2001:294318 CAPLUS 
DN 136:1470 

TI Target gene search for the metal-responsive transcription factor MTF-1 

AU Lichtlen, P.; Wang, Y.; Belser, T.; Georgiev, O.; Certa, U.; Sack, R.; Schaffner, W. 

CS Institute of Molecular Biology, University of Zurich, Zurich, CH-8057, Switz. 

SO Nucleic Acids Research (2001), 29(7), 1514-1523 CODEN: NARHAD; ISSN: 0305-1048 

PB Oxford University Press 
DT Journal 
LA English 

AB Activation of genes by heavy metals, notably zinc, cadmium and copper, depends 
on MTF-1, a unique zinc finger transcription factor conserved from insects to human. 
Knockout of MTF-1 in the mouse results in embryonic lethality due to liver decay, while 
knockout of its best characterized target genes, the stress-inducible metallothionein 
genes I and II, is viable, suggesting addnl. target genes of MTF-1. Here we report on 
a multi-pronged search for potential target genes of MTF-1, including 
***microarray*** screening, SABRE selective amplification, a ***computer*** 
search for MREs (DNA-binding sites of MTF-1) and transfection of reporter genes driven 
by candidate gene promoters. Some new candidate target genes emerged, including 
those encoding .alpha.-fetoprotein, the liver-enriched transcription factor C/EBP.alpha.: 
and tear lipocalin/von Ebner's gland protein, all of which have a role in toxicity/the cell 
stress response. In contrast, expression of other cell stress-assocd. genes, such as 
those for superoxide dismutases, thioredoxin and heat shock proteins, do not appear to 
be affected by loss of MTF-1. Our expts. have also exposed some problems with target 
gene searches. First, finding the optimal time window for detecting MTF-1 target genes 
in a lethal phenotype of rapid liver decay proved problematical: 12.5-day-old mouse 
embryos (stage E12.5) yielded hardly any differentially expressed genes, whereas at 
stage 13.0 reduced expression of secretory liver proteins probably reflected the onset 
of liver decay, i.e. a secondary effect. Likewise, up-regulation of some 
proliferation-assocd. genes may also just reflect responses to the concomitant loss of hepatocytes. 
Another sobering finding concerns .gamma.-glutamylcysteine synthetase hc 
(.gamma.-GCShc), which controls synthesis of the antioxidant glutathione and which 
was previously suggested to be a target gene contributing to the lethal phenotype in 
MTF-1 knockout mice. .gamma.-GCShc mRNA is reduced at the onset of liver decay 
but MTF-1 null mutant embryos manage to maintain a very high glutathione level until 
shortly before that stage, perhaps in an attempt to compensate for low expression of 
metallothioneins, which also have a role as antioxidants. 
TI Recovering filter-based microarray data for pathways analysis using a multipoint 
alignment strategy 

AU Reid, Robert; Dix, David J.; Miller, David; Krawetz, Stephen A. 

CS Wayne State University School of Medicine, Detroit, MI, 48201, USA 

SO BioTechniques (2001), 30(4), 762-764,766,768 CODEN: BTNQDO; ISSN: 0736-6205 

PB Eaton Publishing Co. 
DT Journal 
LA English 

AB The use of com. microarrays is rapidly becoming the method of choice for profiling 
gene expression and assessing various disease states. Research Genetics has provided 
a series of biol. and software tools to the research community for these analyses. The 
fidelity of data anal. using these tools is dependent on a series of well-defined ref. 
control points in the array. During the course of our investigations, it became apparent 
that in some instances the ref. control points that are required for anal. became lost in 
background noise. This effectively halted the anal. and the recovery of any information 
contained within that expt. To recover this data and to increase anal. veracity, the 
simple strategy of superimposing a template of ref. control points onto the exptl. array 
was developed. The utility of this tool is established in this communication. 
RE.CNT 14 THERE ARE 14 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 374 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2001:266749 CAPLUS 
DN 135:353397 

TI Analysis of variance for gene expression microarray data 
AU Kerr, M. Kathleen; Martin, Mitchell; Churchill, Gary A. 
CS The Jackson Laboratory, Bar Harbor, ME, 04609, USA 

SO Journal of Computational Biology (2000), 7(6), 819-837 CODEN: JCOBEM; ISSN: 
DT Journal 
LA English 

AB Spotted cDNA microarrays are emerging as a powerful and cost-effective tool for 
large-scale anal. of gene expression. Microarrays can be used to measure the relative 
quantities of specific mRNAs in two or more tissue samples for thousands of genes 
simultaneously. While the power of this technol. has been recognized, many open 
questions remain about appropriate anal. of microarray data. One question is how to 
make valid ests. of the relative expression for genes that are not biased by ancillary 
sources of variation. Recognizing that there is inherent "noise" in microarray data, how 
does one est. the error variation assocd. with an estd. change in expression, i.e., how 
does one construct the error bars? The authors demonstrate that ANOVA methods can 
be used to normalize microarray data and provide ests. of changes in gene expression 
that are cor. for potential confounding effects. This approach establishes a framework 
for the general anal. and interpretation of microarray data. 
TI Methods and computer software products for multiple probe gene expression 
analysis 
PI WO 2001023614 A1 20010405 WO 2000-US26732 20000928 
   W: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CR, CU, CZ, DE, 
   DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, 
   KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, 
   NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, 
   VN, YU, ZA, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM 
   RW: GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW, AT, BE, CH, CY, DE, DK, ES, 
   FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, BF, BJ, CF, CG, CI, CM, GA, GN, GW, 
   ML, MR, NE, SN, TD, TG 
   US 6505125 B1 20030107 US 2000-670510 20000926 
   AU 2000077309 A5 20010430 AU 2000-77309 20000928 
   US 2003216868 A1 20031120 US 2002-315923 20021209 

PRAI US 1999-156353P P 19990928 
     US 2000-208956P P 20000531 
     US 2000-670510 A 20000926 
     WO 2000-US26732 W 20000928 
AB Methods and computer software products are provided for analyzing gene 
expression data. In one embodiment, the expression of a gene is detd. by multiple 
probes in several expts. A principal component anal. is performed to obtain the relative 
expression of the gene in these expts. 
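
The idea in the abstract above can be illustrated with a small sketch. It assumes, hypothetically, that the intensities for one gene are arranged as a table with one row per probe and one column per experiment, and it takes the first uncentered principal component (the leading right singular vector, found here by power iteration) as the relative expression profile. The function name, the data layout and the scaling to the largest value are illustrative assumptions; the record does not describe the patented procedure in this detail.

```python
def relative_expression(intensities, iterations=100):
    """Estimate relative expression across experiments for one gene.

    intensities: list of rows, one row per probe, one column per experiment.
    Returns one value per experiment, scaled so the largest value is 1.0.
    Assumes non-negative intensities that are not all zero.
    """
    n_probes = len(intensities)
    n_exp = len(intensities[0])
    v = [1.0] * n_exp  # starting guess for the leading component
    for _ in range(iterations):
        # Project every probe row onto the current component.
        scores = [sum(row[j] * v[j] for j in range(n_exp)) for row in intensities]
        # Map the probe scores back into experiment space.
        w = [sum(intensities[p][j] * scores[p] for p in range(n_probes))
             for j in range(n_exp)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    top = max(v)
    return [x / top for x in v]


# Hypothetical data: three probes with different affinities, three experiments.
affinity = [1.0, 2.0, 0.5]
expression = [1.0, 2.0, 4.0]
matrix = [[a * e for e in expression] for a in affinity]
profile = relative_expression(matrix)
```

In this idealized case every probe reports the same profile up to a probe-specific factor, so the recovered profile is proportional to the expression values used to build the table.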
TI Design and on-chip synthesis technology of oligonucleotide microarray 
AU Lu, Zuhong; Zhao, Yujie; He, Nongyao; Sun, Xiao 
CS National Laboratory for Molecular and Biomolecular Electronics, Southeast 
University, Nanjing, 210096, Peop. Rep. China 
AB A review, with refs., discussing genechip engineering including a set of techniques, 
such as chip fabrication, target gene prepn. and hybridization, pattern detection and 
processing, bioinformatics related to the probe design and data anal. Some results in 
on-chip synthesizing the oligonucleotides microarray with mol. stamping or microfluidic 
molds, and developing software for probe designs are presented. 
TI Method and system for overlaying at least three microarray images to obtain a 
multicolor composite image 

IN Stephan, Todd J.; Noblett, David A.; Yang, Jun 
DT Patent 
PI WO 2001016583 A2 20010308 WO 2000-US40806 20000901 W: 
AB Disclosed are a method and system for overlaying at least three microarray images 
to obtain a multicolor composite image, which is then displayed on a monitor of a 
computer system. The microarray images are taken from a microarray scanner of a 
DNA microarray and can be viewed simultaneously through the use of the image 
overlays where each image is represented by a different color. Each pixel of the 
composite image is generated by the OR operator applied to all corresponding pixels of 
the microarray images. Registration of the microarray images can be altered with a 
keyboard or mouse of the computer system. 
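
The pixel-wise OR overlay described in the abstract above can be sketched as follows. This is a minimal illustration, assuming 8-bit grayscale scans stored as nested lists and one display color per scan; the function name and the tinting step are assumptions made for the example and are not taken from the patent.

```python
def overlay_or(images, colors):
    """Overlay equally sized 8-bit grayscale images into one RGB composite.

    images: list of 2-D lists of ints in 0..255.
    colors: one (r, g, b) tuple per image, each component in 0..255.
    """
    height, width = len(images[0]), len(images[0][0])
    composite = [[(0, 0, 0) for _ in range(width)] for _ in range(height)]
    for image, color in zip(images, colors):
        for y in range(height):
            for x in range(width):
                # Tint the gray value with this image's display color.
                tinted = tuple(image[y][x] * c // 255 for c in color)
                # Bitwise OR merges the tinted pixel into the composite.
                composite[y][x] = tuple(
                    a | b for a, b in zip(composite[y][x], tinted)
                )
    return composite


red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)
scan_a = [[255, 0], [0, 0]]
scan_b = [[255, 0], [0, 128]]
scan_c = [[0, 0], [64, 0]]
composite = overlay_or([scan_a, scan_b, scan_c], [red, green, blue])
print(composite[0][0])  # (255, 255, 0): the spot is bright in scans a and b
```

Because each scan is mapped to its own color channel in this example, the OR never mixes bits from different scans within a channel, so overlapping spots show up as combined colors.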
TI Experimental annotation of the human genome using microarray technology 
AU Shoemaker, D. D.; Schadt, E. E.; Armour, C. D.; He, Y. D.; Garrett-Engele, P.; 
McDonagh, P. D.; Loerch, P. M.; Leonardson, A.; Lum, P. Y.; Cavet, G.; Wu, L. F.; 
Altschuler, S. J.; Edwards, S.; King, J.; Tsang, J. S.; Schimmack, G.; Schelter, J. M.; 
Koch, J.; Ziman, M.; Marton, M. J.; Li, B.; Cundiff, P.; Ward, T.; Castle, J.; Krolewski, 
M.; Meyer, M. R.; Mao, M.; Burchard, J.; Kidd, M. J.; Dai, H.; Phillips, J. W.; Linsley, P. 
S.; Stoughton, R.; Scherer, S.; Boguski, M. S. 
CS Rosetta Inpharmatics, Inc., Kirkland, WA, 98034, USA 

SO Nature (London, United Kingdom) (2001), 409(6822), 922-927 CODEN: NATUAS; 
PB Nature Publishing Group 

DT Journal 

LA English 

AB The most important product of the sequencing of a genome is a complete, 
accurate catalog of genes and their products, primarily mRNA transcripts and their 
cognate proteins. Such a catalog cannot be constructed by computational annotation 
alone; it requires exptl. validation on a genome scale. Using 'exon' and 'tiling' arrays 
fabricated by ink-jet oligonucleotide synthesis, the authors devised an exptl. approach 
to validate and refine computational gene predictions and define full-length transcripts 
on the basis of co-regulated expression of their exons. These methods can provide 
more accurate gene nos. and allow the detection of mRNA splice variants and 
identification of the tissue- and disease-specific conditions under which genes are 
expressed. The authors apply the technique to chromosome 22q under 69 exptl. 
condition pairs, and to the entire human genome under two exptl. conditions. The 
authors discuss implications for more comprehensive, consistent and reliable genome 
annotation, more efficient, full-length complementary DNA cloning strategies and 
application to complex diseases. 
AB AutoGene is a fully automated system that allows for batch processing of 
hundreds of images at a time and also incorporates more sophisticated image anal. and 
statistically reliable data quantification. The system has been designed to fully 
automate image anal. and data quantification operations and answer the need of the 
pharmaceutical drug discovery labs. and academic core facilities. Automatic spot 
finding is its main characteristic and autonomous operation the second. After 
quantification the data can be reviewed and shared by using a software 
ResultsReviewer. 
TI Tools for analyzing microarray expression data 
AB Microarray technologies have emerged as key tools for genomic expression anal. 
for the purposes of studying disease states, identifying drug targets, and profiling 
time-, tissue- or stage-dependent changes. The resulting vol. of data generated 
necessitates the use of bioinformatics tools to find interesting gene expression 
patterns, to identify statistically significant changes across expts., and to provide addnl. 
tools relevant to data mining. We have developed a comprehensive workbench soln., 
GeneSpring(TM), which (1) comes with an intuitive interface incorporating organized file 
management, (2) handles data from multiple array formats, (3) includes multiple data 
display formats, (4) includes a suite of statistical clustering tools, and (5) incorporates 
automated annotation and cross-referencing. We will discuss the set of algorithms 
collectively designed to facilitate gene function identification from large scale genomic 
expression expts. 
TI "Gene switch" found with the computer 
AU Grote, Korbinian 
AB A brief review with 5 refs., describing transcription complex of the genome, 
detection of single binding sites in the genome, and research on structure of regulatory 
networks and participation of disease-relevant genes within transcription cascades. 
Detection of complex promoter structures and single binding sites by 
***computer*** models and application of ***microarray*** technol. for anal. of 
gene expression are discussed. 
TI The microarray explorer tool for data mining of cDNA microarrays: application for 
the mammary gland. [Erratum to document cited in CA135:29797] 

AU Lemkin, Peter F.; Thornwall, Gregory C.; Walton, Katherine D.; Hennighausen, 
Lothar 

CS Laboratory of Experimental and Computational Biology, Frederick, MD, 21702, USA 
DT Journal 
AB In the last line of the Abstr., the Web address for MAExplorer is incorrect; the 
correct address should be http://www.lecb.ncifcrf.gov/MAExplorer. 
TI MAD: a suite of tools for microarray data management and processing 

AU Liao, Birong; Hale, Walker; Epstein, Charles B.; Butow, Ronald A.; Garner, Harold R. 

CS Center for Biomedical Inventions, University of Texas Southwestern Medical 
AB Microarray data management and processing (MAD) is a set of Windows 
integrated software for microarray anal. It consists of a relational database for data 
storage with many user-interfaces for data manipulation, several text file parsers and 
Microsoft Excel macros for automation of data processing, and a generator to produce 
text files that are ready for cluster anal. 
TI The Microarray Explorer tool for data mining of cDNA microarrays: application for 
the mammary gland 

AU Lemkin, Peter F.; Thornwall, Gregory C.; Walton, Katherine D.; Hennighausen, 
Lothar 

CS Laboratory of Experimental and Computational Biology, NCI, FCRDC, Frederick, 
MD, 21702, USA 

SO Nucleic Acids Research (2000), 28(22), 4452-4459 CODEN: NARHAD; ISSN: 0305- 
1048 

PB Oxford University Press 
DT Journal 
LA English 

AB The Microarray Explorer (MAExplorer) is a versatile Java-based data mining 
bioinformatic tool for analyzing quant. cDNA expression profiles across multiple 
microarray platforms and DNA labeling systems. It may be run as either a stand-alone 
application or as a Web browser applet over the Internet. With this program it is 
possible to (i) analyze the expression of individual genes, (ii) analyze the expression of 
gene families and clusters, (iii) compare expression patterns and (iv) directly access 
other genomic databases for clones of interest. Data may be downloaded as required 
from a Web server or in the case of the stand-alone version, reside on the user's 
computer. Analyses are performed in real-time and may be viewed and directly 
manipulated in images, reports, scatter plots, histograms, expression profile plots and 
cluster analyses plots. A key feature is the clone data filter for constraining a working 
set of clones to those passing a variety of user-specified logical and statistical tests. 
Reports may be generated with hypertext Web access to UniGene, GenBank and other 
Internet databases for sets of clones found to be of interest. Users may save their 
explorations on the Web server or local computer and later recall or share them with 
other scientists in this groupware Web environment. The emphasis on direct 
manipulation of clones and sets of clones in graphics and tables provides a high level of 
interaction with the data, making it easier for investigators to test ideas when looking 
for patterns. The MAExplorer was used to profile gene expression patterns of 1500 
duplicated genes isolated from mouse mammary tissue. The authors identified genes 
that are preferentially expressed during pregnancy and during lactation. One gene we 
identified, carbonic anhydrase III, is highly expressed in mammary tissue from virgin 
and pregnant mice and in gene knock-out mice with underdeveloped mammary 
epithelium. Other genes, which include those encoding milk proteins, are preferentially 
expressed during lactation. MAExplorer may be accessed at 
http://www.lecb.ncifcrf.gov.MAExplorer. 
TI General nonlinear framework for the analysis of gene interaction via multivariate 
expression arrays 

AU Kim, Seungchan; Dougherty, Edward R.; Bittner, Michael L.; Chen, Yidong; 
Sivakumar, Krishnamoorthy; Meltzer, Paul; Trent, Jeffrey M. 
CS Department of Electrical Engineering, Texas A&M University, College Station, TX, 
77843-3128, USA 

SO Journal of Biomedical Optics (2000), 5(4), 411-424 CODEN: JBOPFO; ISSN: 1083-3668 

PB SPIE-The International Society for Optical Engineering 
LA English 

AB A cDNA microarray is a complex biochem.-optical system whose purpose is the 
simultaneous measurement of gene expression for thousands of genes. In this paper 
the authors propose a general statistical approach to finding assocns. between the 
expression patterns of genes via the coeff. of detn. This coeff. measures the degree to 
which the transcriptional levels of an obsd. gene set can be used to improve the 
prediction of the transcriptional state of a target gene relative to the best possible 
prediction in the absence of observations. The method allows incorporation of 
knowledge of other conditions relevant to the prediction, such as the application of 
particular stimuli or the presence of inactivating gene mutations, as predictive elements 
affecting the expression level of a given gene. Various aspects of the method are 
discussed: prediction quantification, unconstrained prediction, constrained prediction 
using ternary perceptrons, and design of predictors given small nos. of replicated 
microarrays. The method is applied to a set of genes undergoing genotoxic stress for 
validation according to the manner in which it points toward previously known and 
unknown relationships. The entire procedure is supported by software that can be 
applied to large gene sets, has a no. of facilities to simplify data anal., and provides 
graphics for visualizing exptl. data, multiple gene interaction, and prediction logic. 
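
The coefficient of determination described in the abstract above compares the error of the best prediction made without observations (e0) with the error of a predictor that uses the observed genes (e), i.e. theta = (e0 - e) / e0. The sketch below is a hedged illustration of that ratio on ternary data (-1, 0, +1). It assumes mean squared error as the error measure and uses a simple lookup-table predictor scored on its own training data, which gives an optimistic estimate; the authors' perceptron design and error estimation are not reproduced here, and all names are hypothetical.

```python
from collections import defaultdict

TERNARY = (-1, 0, 1)


def mse(predicted, actual):
    """Mean squared error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def coefficient_of_determination(predictors, target):
    """Return (e0 - e) / e0 for a lookup-table predictor of a ternary target.

    predictors: list of tuples, the observed ternary states of the predictor genes.
    target: list of ternary states of the target gene, one per sample.
    """
    # e0: error of the best constant guess, made without any observations.
    baseline = min(mse([c] * len(target), target) for c in TERNARY)
    if baseline == 0:
        return 0.0  # target never varies, so nothing can be improved

    # Group target values by the observed predictor pattern.
    groups = defaultdict(list)
    for pattern, value in zip(predictors, target):
        groups[tuple(pattern)].append(value)

    # For each pattern, pick the ternary value with the smallest squared error.
    table = {
        pattern: min(TERNARY, key=lambda c: sum((c - v) ** 2 for v in values))
        for pattern, values in groups.items()
    }
    fitted = [table[tuple(pattern)] for pattern in predictors]
    return (baseline - mse(fitted, target)) / baseline


# Hypothetical example: the target simply follows the single predictor gene.
informative = coefficient_of_determination(
    [(1,), (1,), (-1,), (-1,), (0,), (0,)], [1, 1, -1, -1, 0, 0]
)
# Hypothetical example: the predictor never changes and carries no information.
uninformative = coefficient_of_determination(
    [(0,), (0,), (0,), (0,)], [1, -1, 1, -1]
)
```

A value near 1 means the observed genes remove almost all of the baseline error; a value near 0 means they add nothing over the best unconditional guess.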
TI High throughput and global approaches to gene expression 
AU Ghosh, David 
SO Combinatorial Chemistry and High Throughput Screening (2000), 3(5), 411-420 
PB Bentham Science Publishers 

DT Journal; General Review 

LA English 

AB A review with 79 refs. In the past several years, a new set of technologies based 
on whole genome anal. have revolutionized the study of gene expression. These 
microarray or "gene chip" technologies, which arose out of the development of large- 
scale sequencing approaches, are now coming into increasing use, generating a far 
greater vol. of data than the data representing the sequences themselves. This review 
focuses on the current state of development of these technologies, and the available 
approaches to manage and analyze the information they generate. The applicability of 
this technol. to general problems in biomedicine is also discussed. 
PI WO 2000056762 A2 20000928 WO 2000-US7781 20000322 
   WO 2000056762 A3 20020711 
   W: AE, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, 
   CA, CH, CN, CR, CU, CZ, DE, DK, DM, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, 
   ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, 
   MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, 
   TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW, AM, AZ, BY, KG, KZ, MD, RU, TJ, TM 
   RW: GH, GM, KE, LS, MW, SD, SL, SZ, TZ, UG, ZW, AT, BE, CH, CY, DE, DK, ES, 
   FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, BF, BJ, CF, CG, CI, CM, GA, GN, GW, 
   ML, MR, NE, SN, TD, TG 
   AU 2000039154 A5 20001009 AU 2000-39154 20000322 
   EP 1235855 A2 20020904 EP 2000-918323 20000322 
   R: AT, BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE, MC, PT, IE, FI, CY 
PRAI US 1999-273623 A 19990322 
     WO 2000-US7781 W 20000322 
AB The present invention relates to methods for monitoring differential expression of 
a plurality of genes in a first filamentous fungal cell relative to expression of the same 
genes in one or more second filamentous fungal cells using microarrays contg. 
filamentous fungal expressed sequenced tags. The present invention also relates to 
filamentous fungal expressed sequenced tags and to computer readable media and 
substrates contg. such expressed sequenced tags for monitoring expression of a 
plurality of genes in filamentous fungal cells. DNA sequences are provided for 3770 
ESTs from Fusarium venenatum, 606 ESTs from Aspergillus niger, 4024 ESTs from 
Aspergillus oryzae, and 459 ESTs from Trichoderma reesei. 
TI Computational simulation of bio-microfluidic processes in integrated DNA biochips 
AU Przekwas, A.; Makhijani, V.; Athavale, M.; Klein, A.; Bartsch, P. 
CS CFD Research Corp, Huntsville, AL, USA 

SO Micro Total Analysis Systems 2000, Proceedings of the .mu.TAS Symposium, 4th, 
Enschede, Netherlands, May 14-18, 2000 (2000), 561-564. Editor(s): Van den Berg, 
Albert; Olthuis, W.; Bergveld, Piet. Publisher: Kluwer Academic Publishers, Dordrecht, 
Neth. CODEN: 69AJPB 
DT Conference 
LA English 

AB Recent developments in mol. biol. and genetic anal. have inspired strong interests 
in miniaturization of DNA anal. on a single microfluidic chip. In the last few years there 
has been tremendous interest in developing a complete biochem. intelligent 
microsystem for extn., concn., amplification, anal., and processing of DNA. Current 
biochips are being developed in a very conventional exptl. trial and error manner with 
little computational design support. This paper presents a new software tool, 
CFD-ACE+, which has been developed for multidisciplinary, multiscale design of biochips. 
The paper describes the computational physics involved in modeling bio-microfluidic 
devices, and demonstrates it on a biochip for extn., concn. of DNA from fluidic samples 
and on PCR amplification. Other bioprocessing steps such as hybridization and 
electrophoretic sepn. in microfluidic networks on a chip, can also be analyzed with 
CFD-ACE+. 
TI Gradient gas sensor microarrays for on-line process control - a new dynamic 
classification model for fast and reliable air quality assessment 
AU Menzel, R.; Goschnick, J. 

CS Forschungszentrum Karlsruhe, Institut fur Instrumentelle Analytik, Karlsruhe, 
D-76021, Germany 

SO Sensors and Actuators, B: Chemical (2000), B68(1-3), 115-122 CODEN: SABCEB; 
ISSN: 0925-4005 

PB Elsevier Science S.A. 

DT Journal 

LA English 

AB A dynamic gas classification model was developed to achieve a reliable online 
discrimination at very fast response times. The aim was to be able to follow rapid 
changes in gas compns. using an electronic nose in consumer applications. The 
electronic nose is based on a micro-array specially designed for prodn. at very low 
costs. This is essential for application in mass products. Common classification 
methods used for signal evaluation of electronic noses such as linear discriminant anal. 
(LDA), neural networks (NN), or soft independent modeling of class analogy (SIMCA) 
failed to detect non-stationary gas mixts. However, the new model combines 
classification of steady states with transient evaluation via time series anal. Rapid 
signal transients are detected by appropriate digital filters; steady state signals are 
classified by the above mentioned std. methods. The simplicity of the algorithm model 
allows implementation in low-cost electronic units, contg. micro-controllers with very 
limited memory capacity. To give an example, the automatic control of the ventilation 
flap of automobiles was studied. Intermediate streams of bad air could be detected 
within 1-2 s. The error of pollutant detection was reduced from 25%, applying static 
classification only, to 10% for the new dynamic model. 
CS Molecular Regulation and Neuroendocrinology Section Clinical Endocrinology 
Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National 
Institutes of Health, Bethesda, MD, 20892, USA 

SO Molecular Endocrinology (2000), 14(7), 947-955 CODEN: MOENEN; ISSN: 0888-8809 

PB Endocrine Society 
DT Journal 
LA English 

AB The liver is an important target organ of thyroid hormone. However, only a 
limited no. of hepatic target genes have been identified, and little is known about the 
pattern of their regulation by thyroid hormone. We used a quant. fluorescent cDNA 
microarray to identify novel hepatic genes regulated by thyroid hormone. Fluorescent- 
labeled cDNA prepd. from hepatic RNA of T3-treated and hypothyroid mice was 
hybridized to a cDNA ***microarray***, representing 2225 different mouse genes, 
followed by ***computer*** anal. to compare relative changes in gene expression. 
Fifty five genes, 45 not previously known to be thyroid hormone-responsive genes, 
were found to be regulated by thyroid hormone. Among them, 14 were pos. regulated 
by thyroid hormone, and unexpectedly, 41 were neg. regulated. The expression of 8 of 
these genes was confirmed by Northern blot analyses. Thyroid hormone affected gene 
expression for a diverse range of cellular pathways and functions, including 
gluconeogenesis, lipogenesis, insulin signaling, adenylate cyclase signaling, cell 
proliferation, and apoptosis. This is the first application of the microarray technique to 
study hormonal regulation of gene expression in vivo and should prove to be a 
powerful tool for future studies of hormone and drug action. 
TI Genome-wide characterization of the Zap1p zinc-responsive regulon in yeast 
AU Lyons, Thomas J.; Gasch, Audrey P.; Gaither, L. Alex; Botstein, David; Brown, 
Patrick O.; Eide, David J. 

CS Department of Nutritional Sciences, University of Missouri, Columbia, MO, 65211, 
USA 

SO Proceedings of the National Academy of Sciences of the United States of America 
DT Journal 
AB The Zap1p transcription factor senses cellular zinc status and increases expression 
of its target genes in response to zinc deficiency. Previously known Zap1p-regulated 
genes encode the Zrt1p, Zrt2p, and Zrt3p zinc transporter genes and Zap1p itself. To 
allow the characterization of addnl. genes in yeast important for zinc homeostasis, a 
systematic study of gene expression on the genome-wide scale was used to identify 
other Zap1p target genes. Using a combination of DNA ***microarrays*** and a 
***computer***-assisted anal. of shared motifs in the promoters of similarly 
regulated genes, we identified 46 genes that are potentially regulated by Zap1p. 
Zap1p-regulated expression of seven of these newly identified target genes was 
confirmed independently by using lacZ reporter fusions, suggesting that many of the 
remaining candidate genes are also Zap1p targets. Our studies demonstrate the 
efficacy of this combined approach to define the regulon of a specific eukaryotic 
transcription factor. 
TI Genome-directed primers for selective labeling of bacterial transcripts for DNA 
CS Center for Biomedical Inventions and Department of Medicine, University of 
Texas-Southwestern Medical Center, Dallas, TX, 75390-8573, USA 

SO Nature Biotechnology (2000), 18(6), 679-682 CODEN: NABIF9; ISSN: 1087-0156 
AB DNA microarrays have the ability to analyze the expression of thousands of the 
same set of genes under at least two different exptl. conditions. However, DNA 
microarrays require substantial amts. of RNA to generate the probes, esp. when 
bacterial RNA is used for hybridization (50 .mu.g of bacterial total RNA contains approx. 
2 .mu.g of mRNA). We have developed a computer-based algorithm for prediction of 
the minimal no. of primers to specifically anneal to all genes in a given genome. The 
algorithm predicts, for example, that 37 oligonucleotides should prime all genes in the 
Mycobacterium tuberculosis genome. We tested the usefulness of the genome-directed 
primers (GDPs) in comparison to random primers for gene expression profiling using 
DNA microarrays. Both types of primers were used to generate fluorescent-labeled 
probes and to hybridize to an array of 960 mycobacterial genes. Compared to random- 
primer probes, the GDP probes were more sensitive and more specific, esp. when 
mammalian RNA samples were spiked with mycobacterial RNA. The GDPs were used 
for gene expression profiling of mycobacterial cultures grown to early log or stationary 
growth phases. This approach could be useful for accurate genome-wide expression 
anal., esp. for in vivo gene expression profiling, as well as directed amplification of 
sequenced genomes. 
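
The abstract above does not say how the minimal primer set is computed. As a hedged illustration only, the problem can be viewed as a set-cover task and approximated with a greedy heuristic: repeatedly pick the candidate primer that anneals to the most genes not yet covered. In this sketch "anneals" is simplified to an exact match of the primer's reverse complement within the gene sequence, and the function names, the candidate list and the toy sequences are all hypothetical. A greedy choice gives a small set, not a guaranteed minimum.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def greedy_primer_set(genes, candidates):
    """Choose primers greedily until every coverable gene is primed.

    genes: dict mapping gene name to its sense-strand sequence.
    candidates: list of candidate primer sequences.
    Returns (chosen primers, genes that no candidate can prime).
    """
    # Precompute which genes each candidate primer can anneal to.
    covers = {
        primer: {name for name, seq in genes.items()
                 if reverse_complement(primer) in seq}
        for primer in candidates
    }
    uncovered = set(genes)
    chosen = []
    while uncovered:
        # Pick the primer that covers the most still-uncovered genes.
        best = max(candidates, key=lambda p: len(covers[p] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break  # remaining genes cannot be primed by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Toy data, invented for the example.
genes = {"g1": "ATGGCCAAA", "g2": "TTTGGCCAT", "g3": "ATGAAACCC"}
candidates = ["TTTG", "GGCC", "GGGT"]
chosen, unprimed = greedy_primer_set(genes, candidates)
print(chosen)  # ['GGCC', 'GGGT']
```

Here the primer GGCC covers g1 and g2 in one step and GGGT then covers g3, so two primers suffice for the three toy genes.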
TI A ***computer*** program for generating gene-specific fragments for 
AU Xu, Dong; Xu, Ying; Li, Gary; Zhou, Jizhong 
SO Frontiers Science Series (2000), 30(Currents in Computational Molecular Biology), 
PB Universal Academy Press, Inc. 
DT Journal 
AB A computer program useful in selecting gene-specific fragments which can be used 
to design PCR primer pairs for PCR amplification and arrays was developed. 
TI Automated analysis of multivariate nonlinear gene relations based on cDNA 
microarray expression data 

AU Kim, Seungchan; Dougherty, Edward R.; Bittner, Michael L.; Chen, Yidong; 
Sivakumar, Krishnamoorthy; Meltzer, Paul S.; Trent, Jeffrey M. 
CS Dep. Electr. Eng., Texas A&M Univ., USA 
AB A cDNA microarray is a complex biochem.-optical system whose purpose is the 
simultaneous measurement of gene expression for thousands of genes. This paper 
describes a general statistical environment for finding assocns. among gene expression 
patterns, and between genes and external conditions, via the coeff. of detn. This 
coeff. measures the degree to which the transcriptional levels of an obsd. gene set can 
be used to improve the prediction of the transcriptional state of a target gene relative 
to the best possible prediction in the absence of observations. Various aspects of the 
method are discussed: prediction quantification, design of predictors given small nos. 
of replicated microarrays, and constrained prediction using ternary perceptrons. A 
main focus is the supporting software and its facilities for data anal. and visualization. 
RE.CNT 18 THERE ARE 18 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 395 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 2000:263599 CAPLUS 
DN 133:160266 

TI Overview of a microarray scanner: design essentials for an integrated acquisition 
and analysis platform 

AU Basarsky, Trent; Verdnik, Damian; Zhai, Jack Ye; Wellis, David 
AB A review with 15 refs. Data quality from hardware and data anal. and confidence 
measure from software form the basis of a well designed microarray scanner and data 
extn. software system. Successful hardware design is only possible if one has a deep 
understanding and experience of optical and electronic technologies, whereas the 
usability and efficiency of such a system is derived from the tightly integrated 
communication between hardware and software, including optimized algorithms and 
a thoughtful and easy to use software interface. The final requirement of cost can 
be met by offering the scanner and multiple copies of the acquisition and anal. 
software at a value price point attractive to both academia and industry, as 
accomplished with the GenePix 4000. The future of microarray scanning and anal. can 
be summarized in one word: automation. On the software side, there is not yet an 
anal. package that can ext. the data from a microarray without human intervention, but 
existing software is rapidly approaching this point. Anal. of dataset from multiple 
arrays is already offered, but not as a component of a completely integrated system. 
On both the hardware and software sides, full automation is not far off for integrated 
scanning and anal. systems. 
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AB A series of microarray expts. produces observations of differential expression for 
thousands of genes across multiple conditions. It is often not clear whether a set of 
expts. are measuring fundamentally different gene expression states or are measuring 
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similar states created through different mechanisms. It is useful, therefore, to define a 
core set of independent features for the expression states that allow them to be 
compared directly. Principal components anal. (PCA) is a statistical technique for detg. 
the key variables in a multidimensional data set that explain the differences in the 
observations, and can be used to simplify the anal. and visualization of 
multidimensional data sets. The authors show that application of PCA to expression 
data (where the exptl. conditions are the variables, and the gene expression 
measurements are the observations) allows us to summarize the ways in which gene 
responses vary under different conditions. Examn. of the components also provides 
insight into the underlying factors that are measured in the expts. The authors applied 
PCA to the publicly released yeast sporulation data set (Chu et al. 1998). In that work, 
7 different measurements of gene expression were made over time. PCA on the time- 
points suggests that much of the obsd. variability in the expt. can be summarized in 
just 2 components, i.e., 2 variables capture most of the information. These components 
appear to represent (1) overall induction level and (2) change in induction level over 
time. The authors also examd. the clusters proposed in the original paper, and show 
how they are manifested in principal component space. These results are available on 
the internet at http://www.smi.stanford.edu/projects/helix/PCArray. 
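
The PCA step described above can be sketched in plain Python. This is an illustration under assumptions, not the authors' software: rows are genes (observations), columns are exptl. conditions (variables), the covariance matrix uses the n - 1 denominator, and the leading component is found by power iteration from a uniform start vector (which would fail in the contrived case where that vector is orthogonal to the leading component). All function names are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns (conditions) across rows (genes)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvalue and unit eigenvector of cov via power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: no direction to find
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, v


def explained_fraction(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    eigenvalue, _ = first_component(cov)
    return eigenvalue / total


if __name__ == "__main__":
    # Four genes measured under two conditions (made-up numbers).
    expression = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
    print(explained_fraction(expression))
```

In the made-up data the second condition is exactly twice the first, so a single component carries all of the variance; real expression data would spread variance over several components.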
AB The title research of Brown, M.P.S. et al. (Proc. Nat'l. Acad. Sci., U.S.A., 2000, 97, 
pg. 262-267) is reviewed with commentary and 6 refs. The impact of microarray 
technol. on biol. will depend on computational methods of data anal. A supervised 
computer-learning method using support vector machines predicts gene function from 
expression data and shows promise. 
TI Development of an efficient data processing method for cDNA microarray and its 
application to tissue expression profiling 

AU Kadota, Koji; Miki, Rika; Okazaki, Yasushi; Shimizu, Kentaro; Hayashizaki, 
Yoshihide 

CS Genome Science Lab, Genomic Sciences Center, Ibaraki, 305-0074, Japan 
AB The authors report the development of a filtering program to selectively ext. genes 
expressed in cDNA microarrays. Probes (tissue mRNA) were prepd. by labeling with Cy-3 
dye. The cDNA microarray uses the dual dye system. The authors used Cy-5 labeled 
embryo 17.5 days (whole body) as ref. The algorithm of this filtering program consists 
of 3 steps: (1) omit the results which have flags (flags are set manually when the spot 
image does not fulfill certain criteria), (2) eliminate spots whose signal intensity is 
less than mean (background signal) + 3.sigma. in both Cy-3 and Cy-5, (3) eliminate 
spots that are located outside the best fit line (least-mean squares). This program was 
applied to the anal. of the full-length RIKEN cDNA 20K microarray and used to analyze the 
expression profile of normal adult and embryonic tissues. 
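
The three filtering steps above can be sketched as follows. The abstract leaves several details open, so this sketch makes assumptions that are not in the record: a spot is removed in step 2 only when both channels fall below their thresholds, sigma is the sample standard deviation of the background readings, the line in step 3 is an ordinary least-squares fit of Cy-3 on Cy-5, and "outside" means a residual larger than `k` times the RMS residual (with `k` = 2 chosen arbitrarily). The dict layout and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def filter_spots(spots, background_cy3, background_cy5, k=2.0):
    """Return the spots that survive the three filtering steps.

    spots: list of dicts with keys "id", "cy3", "cy5", "flag" (assumed layout).
    k: assumed tolerance for step 3, in units of the RMS residual.
    """
    # Step 1: omit results that carry a flag.
    kept = [s for s in spots if not s["flag"]]

    # Step 2: drop spots below mean(background) + 3 sigma in both channels.
    limit_cy3 = mean(background_cy3) + 3 * stdev(background_cy3)
    limit_cy5 = mean(background_cy5) + 3 * stdev(background_cy5)
    kept = [
        s for s in kept
        if not (s["cy3"] < limit_cy3 and s["cy5"] < limit_cy5)
    ]

    # Step 3: fit cy3 = intercept + slope * cy5 and drop far-off spots.
    if len(kept) < 3:
        return kept
    x_bar = mean(s["cy5"] for s in kept)
    y_bar = mean(s["cy3"] for s in kept)
    sxx = sum((s["cy5"] - x_bar) ** 2 for s in kept)
    if sxx == 0:
        return kept  # all Cy-5 values identical: no line to fit
    sxy = sum((s["cy5"] - x_bar) * (s["cy3"] - y_bar) for s in kept)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [s["cy3"] - (intercept + slope * s["cy5"]) for s in kept]
    rms = sqrt(sum(r * r for r in residuals) / len(residuals))
    return [s for s, r in zip(kept, residuals) if abs(r) <= k * rms]
```

In practice the fit might be done on log-transformed intensities and the tolerance tuned per array; neither choice is stated in the abstract.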
TI Knowledge-based analysis of microarray gene expression data by using support 
vector machines 
AB We introduce a method of functionally classifying genes by using gene expression 
data from DNA microarray hybridization expts. The method is based on the theory of 
support vector machines (SVMs). SVMs are considered a supervised computer learning 
method because they exploit prior knowledge of gene function to identify unknown 
genes of similar function from expression data. SVMs avoid several problems assocd. 
with unsupervised clustering methods, such as hierarchical clustering and self- 
organizing maps. SVMs have many math. features that make them attractive for gene 
expression anal., including their flexibility in choosing a similarity function, sparseness 
of soln. when dealing with large data sets, the ability to handle large feature spaces, 
and the ability to identify outliers. We test several SVMs that use different similarity 
metrics, as well as some other supervised learning methods, and find that the SVMs 
best identify sets of genes with a common function using expression data. Finally, we 
use SVMs to predict functional roles for uncharacterized yeast ORFs based on their 
expression data. 
AB A review with 10 refs., on the data management and anal. for gene expression 
profiles produced by DNA microarray technol. Computer software and information 
systems for the anal. of large-scale expression data, cluster anal., and identification of 
genetic networks are discussed. 

L6 ANSWER 401 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1999:53123 CAPLUS 
DN 130:247491 

TI Options available - from start to finish - for obtaining expression data by 
microarray 

AU Bowtell, David D. L. 

CS Peter MacCallum Cancer Institute, Melbourne, 3000, Australia 

SO Nature Genetics (1999), 21(1, Suppl.), 25-32 CODEN: NGENEC; ISSN: 1061-4036 

PB Nature America 

DT Journal; General Review 

LA English 

AB A review, with 35 refs. The excitement surrounding microarray technol. has been 
tempered by the limited ability of the general biomedical research community to gain 
access to it. Given that the hardware required for exploitation of the technol. is 
becoming increasingly available, it is an appropriate moment to review options, be they 
com. or publicly available. Here, a snapshot is provided of the rapidly changing field 
of microarray-based RNA expression anal. and the components and procedures for 
putting together a complete system are considered. The complete system is divided 
into sample prepn., array generation and sample anal., and data handling and 
interpretation. 

RE.CNT 35 THERE ARE 35 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 
TI Adapting the Biomek 2000 Laboratory Automation Workstation for printing DNA 
microarrays 

AU Macas, Jiri; Nouzova, Marcela; Galbraith, David W. 
CS Univ. Arizona, Tucson, AZ, USA 

SO BioTechniques (1998), 25(1), 106, 108-110 CODEN: BTNQDO; ISSN: 0736-6205 
PB Eaton Publishing Co. 
DT Journal 
LA English 

AB The Biomek 2000 Lab. Automation Workstation is used for liq. handling and other 
repetitive operations in many labs. Since it has very good spatial positioning 
capabilities, we have modified this workstation to deliver samples at high densities onto 
microscope slides to produce DNA microarrays. The workstation tool, originally 
designed for bacterial colony replication, was adapted to carry special printing pins and 
was further modified to improve its positional accuracy. Software written in the Tool 
Command Language was concurrently developed to control the movements of the 
workstation arm during the process of printing. With these modifications, the 
workstation can reliably deliver individual samples at a spacing of 0.5 mm, 
corresponding to a total of more than 3000 samples on a single slide. Arrays prepd. in 
this way were successfully tested in hybridization expts. 

RE.CNT 8 THERE ARE 8 CITED REFERENCES AVAILABLE FOR THIS RECORD 
ALL CITATIONS AVAILABLE IN THE RE FORMAT 

L6 ANSWER 403 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1996:445602 CAPLUS 
DN 125:179671 

TI Application of the finite analytic numerical method. Part 1. Diffusion problems on 
coplanar and elevated interdigitated microarray band electrodes 

AU Jin, Baokang; Qian, Weijun; Zhang, Zuxun; Shi, Hansheng 

CS Department of Chemistry and National Key Laboratory of Coordination Chemistry, 
Nanjing University, Nanjing, 210093, Peop. Rep. China 

SO Journal of Electroanalytical Chemistry (1996), 411(1-2), 29-36 CODEN: JECHES; 
ISSN: 0368-1874 
PB Elsevier 
DT Journal 
LA English 

AB Diffusion problems on coplanar and elevated interdigitated microarray band 
electrodes (IDAs) were studied by the finite analytic numerical method (FAM). 
Chronoamperometric curves and steady-state current-potential curves for both 
coplanar and elevated IDAs were simulated for diffusion controlled and quasi-reversible 
systems. The simulated results for coplanar IDAs were compared with those available 
in the literature, and are in good agreement. The influence of the geometric 
parameters (electrode height h_e, ratio of electrode width w_e to gap width w_g) of the 
elevated IDAs was also studied. The computing time required is much less 
than that reported in the literature. 

L6 ANSWER 404 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1991:666274 CAPLUS 
DN 115:266274 

TI Computer generated microlens arrays and their application to optical free space 
switching networks 

AU Bird, K. D.; Daly, D.; Hall, T. J. 

CS King's Coll., London, UK 

SO IEE Conference Publication (1991), 342(Int. Conf. Hologr. Syst., Compon. Appl., 
3rd, 1991), 57-61 CODEN: IECPB4; ISSN: 0537-9989 
DT Journal 
LA English 

AB The application of microlens arrays in optical switching architectures is 
addressed. The diffractive and refractive properties of microlens arrays were 
demonstrated in their use as efficient optical fan-in/fan-out array generators in the 
Fourier and image plane resp. A flexible and scalable approach to the generation of 
these lens arrays of high quality and uniformity has been detailed. Characterization of 
HOECHST A24620A resist has resulted in the photolithog. prodn. of multilevel 
structures of approximated lens profiles, allowing future control of f-nos. and 
aberrations. This may also be extended to the fabrication of surface relief structures 
for use as alternative fan-out array generators. Future research is into the control of 
these lenses during annealing, their optical qualities and subsequently the integration 
of optical fibers to lens waveguides in switching architectures. Initial insight into the 
tolerances in design is presented. 

L6 ANSWER 405 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1990:107213 CAPLUS 
DN 112:107213 

TI Passivation of pinholes in octadecanethiol monolayers on gold electrodes by 
electrochemical polymerization of phenol 

AU Finklea, Harry O.; Snider, Daniel A.; Fedyk, John 

CS Dep. Chem., West Virginia Univ., Morgantown, WV, 26506-6045, USA 

SO Langmuir (1990), 6(2), 371-6 CODEN: LANGD5; ISSN: 0743-7463 

DT Journal 

LA English 

AB An organized monolayer of octadecanethiol on a Au electrode strongly inhibits 
faradaic reactions except at pinholes in the monolayer. For simpler outer-sphere redox 
couples, the monolayer-coated electrode behaves like a microelectrode array, with 
pinholes acting as the microelectrodes. The av. size and sepn. of the pinholes can be 
estd. by fitting the exptl. cyclic voltammograms with ***simulated*** 
voltammograms for a ***microarray*** electrode. The pinholes are selectively and 
permanently passivated by electrochem. polymn. of phenol in dil. sulfuric acid. The 
deposition of poly(phenylene oxide) suppresses the pinhole currents at low 
overpotential, but residual faradaic currents become visible at large overpotential. The 
residual currents are assigned to electron tunneling between the electrode and mols. 
which partially penetrate the monolayer. 

L6 ANSWER 406 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1989:162368 CAPLUS 
DN 110:162368 

TI Time and spatial dependence of the concentration of less than 10^5 microelectrode- 
generated molecules 

AU Licht, Stuart; Cammarata, Vince; Wrighton, Mark S. 

CS Dep. Chem., Massachusetts Inst. Technol., Cambridge, MA, 02139, USA 

SO Science (Washington, DC, United States) (1989), 243(4895), 1176-8 CODEN: 
SCIEAS; ISSN: 0036-8075 

DT Journal 

LA English 

AB The time and spatial dependence of the concn. of as few as 40,000 
electrogenerated, redox-active mols. was detd. The distance between generator and 
detector microelectrodes in an array used in the study could be varied from 0.8 to 28 
.mu.m. Measurements of a sufficiently small ensemble of mols. allowed the exptl. 
results to be compared with a quant. simulation of the random movement of each 
member of the ensemble. The transit time of an electrogenerated species from the 
generator to a collector microelectrode was measured as a function of viscosity, 
diffusivity, and distance. 
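
A minimal sketch of the kind of random-movement simulation mentioned above, not the authors' procedure: each mol. performs an unbiased 1-D lattice walk that starts at the generator (position 0, treated as a reflecting wall) and stops on first reaching the collector `distance` lattice sites away. Lattice units, the reflecting wall, the ensemble size, and all names are assumptions made for the example.

```python
import random


def transit_steps(distance, rng):
    """Steps one walker needs to first reach `distance`, starting from 0."""
    position, steps = 0, 0
    while position < distance:
        if position == 0:
            position = 1  # reflecting wall at the generator electrode
        else:
            position += rng.choice((-1, 1))
        steps += 1
    return steps


def mean_transit_steps(distance, walkers=1000, seed=0):
    """Average first-arrival step count over an ensemble of walkers."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = sum(transit_steps(distance, rng) for _ in range(walkers))
    return total / walkers


if __name__ == "__main__":
    for gap in (2, 4, 8):
        print(gap, mean_transit_steps(gap))
```

For this particular walk the expected step count equals the square of the distance, which mirrors the distance-squared scaling of diffusive transit times.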

L6 ANSWER 407 OF 407 CAPLUS COPYRIGHT 2006 ACS on STN 
AN 1985:413455 CAPLUS 
DN 103:13455 

TI Numerical ***simulation*** of convective diffusion at a ***microarray*** 
channel electrode 

AU Moldoveanu, S.; Anderson, J. L. 

CS Dep. Chem., Univ. Georgia, Athens, GA, 30602, USA 

SO Journal of Electroanalytical Chemistry and Interfacial Electrochemistry (1985), 
185(2), 239-52 CODEN: JEIEBC; ISSN: 0022-0728 

DT Journal 

LA English 

AB Concn. profiles and currents were simulated for an array electrode on one wall of 
a rectangular flow-through channel, using the backward implicit finite difference 
numerical procedure to solve the equation governing the convective diffusion process. 
Different simplifying hypotheses usually considered in these types of calcns., and their 
concomitant errors were analyzed. A simple criterion for estg. the importance of 
longitudinal and lateral diffusion was developed. The responses of a variety of both 
regular and pseudo-randomly distributed geometries of array electrodes were 
evaluated under a wide range of conditions. Current response was evaluated as a 
function of the no. of active sites, fractional surface blockage, and flow conditions, 
relative to solid electrodes. The geometrical pattern of the array was found to affect 
the current response, a regularly spaced array yielding the max. response for a given 
degree of partial blockage of the electrode and a const. no. of microelectrodes. 
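
As a simplified illustration of the backward implicit finite difference procedure named above, the sketch below advances 1-D diffusion (no convection, no lateral terms) by one implicit time step and solves the resulting tridiagonal system with the Thomas algorithm. It is not the channel-electrode model of the record; the grid, the fixed boundary values, and the parameter `lam` = D*dt/dx**2 are assumptions made for the example.

```python
def backward_implicit_step(conc, lam):
    """Advance a 1-D concn. profile by one backward-implicit time step.

    conc: list of node values; conc[0] and conc[-1] are held fixed.
    lam:  D * dt / dx**2 (dimensionless), assumed constant on the grid.
    Each interior node i satisfies
        -lam*new[i-1] + (1 + 2*lam)*new[i] - lam*new[i+1] = conc[i].
    """
    n = len(conc)
    if n < 3:
        raise ValueError("need at least one interior node")
    m = n - 2  # number of interior unknowns
    lower = [-lam] * m
    diag = [1.0 + 2.0 * lam] * m
    upper = [-lam] * m
    rhs = list(conc[1:-1])
    rhs[0] += lam * conc[0]     # known left boundary value
    rhs[-1] += lam * conc[-1]   # known right boundary value

    # Thomas algorithm: forward elimination.
    for i in range(1, m):
        factor = lower[i] / diag[i - 1]
        diag[i] -= factor * upper[i - 1]
        rhs[i] -= factor * rhs[i - 1]

    # Back substitution.
    interior = [0.0] * m
    interior[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        interior[i] = (rhs[i] - upper[i] * interior[i + 1]) / diag[i]

    return [conc[0]] + interior + [conc[-1]]


if __name__ == "__main__":
    # Electrode surface at node 0 (concn. 0), bulk soln. at the last node.
    profile = [0.0] + [1.0] * 10
    for _ in range(50):
        profile = backward_implicit_step(profile, 0.5)
    print(profile)
```

The implicit scheme stays stable for any positive `lam`, which is the usual reason for preferring it over an explicit update in this kind of calcn.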